Computer Networks.
Sergey Gorbunov
Section 002
Email: sgorbunov@uwaterloo.ca
Office Hours: Mondays 11am-12pm
Mondays/Wednesdays 2:30pm-3:50pm
Course content on LEARN and Piazza. Three assignments, worth 40% overall. 20% midterm and 40% final.
Textbook: computer networking, a top-down approach, 7th edition.
The internet can be thought of as a network of networks, such as mobile networks, regional ISPs, and home networks. The internet consists of connected computers known as hosts, running network apps that communicate over communication links like fiber/copper/satellite. To implement these communications, we use packet switches like routers and switches to forward packets to the right places.
A protocol is an agreed-upon standard for communication, defining the format, order, and actions to take upon communications. An access network connected end users to the rest of the internet, such as DSL (internet over phone lines) and cable (many channels of data encoded in one wire on different frequencies - frequency division multiplexing). Devices in a network have to share access to the cable/DSL modem, using a network switch, or more commonly, a router.
Hosts send packets of data, by breaking up the data into units of length L. The packets are transmitted at the transmission rate R, determined by the link capacity (bandwidth). At the core of a network is a mesh of interconnected routers, making sure every packet gets to the right place. When we visit a website, our local router checks whether the website is on the local network, and if not, forwards it to the router's default gateway, which is the next network along the way to the destination. Every router has a default gateway, which is where the packet should get forwarded to if the router doesn't know what to do with it explicitly.
Routers receive entire packets and store them before processing them and transmitting them to the next link - store and forward. This ensures that we can do error correction properly on the packet, and so that we have all of the packet metadata so we can implement traffic shaping. Since it takes L/R time to transmit/receive a packet, it takes 2L/R time to receive and then retransmit a packet, assuming no propagation delay. To implement store-and-forward, it's generally necessary to queue up incoming packets, and drop packets if the queue fills up.
The key functionality of the network core is to route (determine the path from source to destination using routing algorithms) and forwarding (move packets from the input of the router to the correct output). Other networks use circuit switching, where there are reserved routes between each node that are not shared. This is less efficient but has much better guarantees for things like bandwidth and latency - we don't have issues with congestion, packet loss, or delays like with packet switching, but we're wasting resources when the network isn't at capacity. Circuit switching can use either frequency division multiplexing (channels send data on different frequencies) or time division multiplexing (channels take turns sending data on all frequencies, sort of like packet switching).
Circuit switching vs. packet switching is more of a reserved resources vs. on-demand resources distinction. For applications like audio/video, we'd like to get circuit-like properties over a packet network - this is still an unsolved problem.
Assignment 1 posted, due October 2, writing a simple socket program.
End users usually connect to the internet via access networks provided by access ISPs, and then the millions of access ISPs are connected to each other. ISPs can be connected to each other in a ring, but tihs has reliability issues, makes routes between hosts very long, and doesn't scale up, since the amount of traffic each ISP would have to handle increases linearly with respect to the number of ISPs. ISPs could all be fully connected with each other, but this is impractical, since this would require millions of connections per ISP.
What we end up doing is to use higher-tier ISPs that each connect many access ISPs. These higher-tier ISPs can then connect to each other via peering links, and they can agree to forward each other's packets. Internet exchange points (IXPs) control peering links to make sure their resources are being used efficiently. Due to the tiering, the internet can be considered a network of networks. There are also many tiers in between global and access ISPs, like regional networks and content provider networks (like Google CDN, Akamai). So at the center, we have a small number of well-connected Tier 1 ISPs (there's about 7 of them now, like Level3, Sprint, AT&T), then content providers like Google and Akamai connect to those.
Packet delay and loss usually occur at routers. Routers have to queue up packets when their incoming packet rate exceeds its available packet output capacity, and if the queue becomes full, the router has to start dropping packets from the queue, causing packet loss. Packet loss and delay also commonly happens due to bad connections, when they simply get lost in transmission or are delayed because we have to retransmit them. Specifically, routers induce queueing delay (time spent waiting in the incoming packet queue), transmission delay (time spent transmitting packet on output channel, a function of the packet size and link bandwidth), processing delay (time spent for router to determine where to send a packet), and propagation delay (time spent for packet to travel along the outgoing connection, a function of the link's physical length and the propagation speed, usually the speed of light).
The queueing delay depends on the average incoming packet rate a, the link bandwidth R, and the average packet length L. If La/R \le 1, the queueing delay is proportional to that value. Otherwise, queueing delay will tend to infinity, as the router can no longer keep up.
Real delays on the internet can be inspected using tools like traceroute
. Traceroute uses ICMP to find individual delays from the source to each router along the way to a destination. Essentially, for each router i along the path, we send three ICMP packets that will be detected specifically by router i along the path, which then responds to the ping. Traceroute then times how long it took for the three replies to come back, then shows the delays and the hosts at which our packets were dropped. The packet won't always take the same path, but it can give us a general idea of what routing delays are like.
Packets what's called a TTL (time to live), which determine the number of times the packet can pass through a router. This ensures that if we end up with a cyclic route, packets don't go back and forth forever - instead, packets get their TTL decremented each time they're retransmitted, and are dropped if their TTL reaches 0. For ICMP, we can also request that the router that dropped the packet notify us via ICMP that they dropped our packet. The TTL is a hop count rather than a real-world time because real-world time doesn't really map to how much congestion a packet might cause, only how much it clogs up routers does.
To make sure the traceroute packets go specifically to router i and not any other routers, we set the ICMP packet's TTL to i and request that we get notified if the packet is dropped. After i hops, the packet gets to router i and we get an ICMP packet back telling us that router i dropped it.
Throughput is the maximum transmitting/receiving rate. Usually this is a bottleneck at access ISP and server-to-ISP connections, rather than in things like peering links - the lowest throughput link in a route determines the overall throughput of the entire route.
The internet is organized into many layers, to allow different components to operate at simple levels of abstraction. The 7-layer model is the main way to look at these:
Generally, lower layers will add headers/footers to the higher layers' units during transportation, and then strip them out when they arrive.
The internet wasn't originally designed with much security, assuming mutually trusting users on a network - a lot of work has taken place since to make sure the internet still works today.
Hosts work on the application layer and above (most OSs expose networking on the transport-layer-level), switches work on the link layer, and routers work on the network layer.
Network applications are those that run on end systems (not network core devices), and communicate over networks as part of their core functionality. For example, web browsers, email clients, and text messaging apps.
The client-server architeture is probably the most common one in use today. Servers are always-on hosts that generally live in datacenters and have static IP addresses. Clients might be intermittently connected, use dynamic IPs, and initiate connections to servers. Another common architecture is P2P, where there's no centralized servers and arbitrary end systems can directly communicate, like with BitTorrent.
Processes are programs running on hosts. Processes on the same host communicate using IPC, and processes on different hosts exchange messages instead, usually using sockets. Clients are the initiators of communication, while servers wait to be contacted. A host is identified by its IP address and port.
Most commonly, network apps need data integrity, timing, throughput, and security.
TCP provides reliable transport (same order for sending and receiving), flow control (sender won't overwhelm receiver), congestion control (serve won't overwhelm network). It's connection oriented and doesn't provide timing/throughput/security guarantees. In contrast, UDP doesn't provide any of the above, but is much faster and lower computational/network overhead. It's also packet-oriented. Application layer protocols like SMTP, Telnet, HTTP, or FTP are based on TCP, while others like HTTP Streaming, RTP, SIP might use UDP.
TCP can be secured by running SSL/TLS at the application layer on top of it. It also provides data integrity and endpoint authentication.
Web pages are made of objects, such as HTML files, PNGs, MP3s, which are addressible by their URL and might reference each other by URL. Objects are requested by a web browser (client), from a web server, using HTTP. The client initiates a TCP connection on port 80 (443 for SSL/TLS), and the server accepts the TCP connection and exchanges HTTP messages before closing the connection. HTTP is stateless, and by default fulfilling any request doesn't require us to know about previous requests.
For non-persistent HTTP, we need two round trips per request - one to request, one to respond. In persistent HTTP, the server leaves the connection open afterward, saving round trips in future requests.
Telnet opens a raw TCP connection to a host. For example, you can do telnet some_host 80
and then type in an HTTP GET request, and you'll get an HTTP response back.
User state in HTTP is usually done via cookies - special Cookie
headers in HTTP responses associate values with that host in the user's browser, and those values are sent back via the Cookie
header in future HTTP requests.
Demo with wireshark capturing HTTP messages over TCP, showing headers, application/transport/network layers.
Caching proxies are often used to shorten RTTs by being closer/on faster networks, and to reduce load on application servers. They access the page on your behalf, pass the responses back, and save them to return them directly in future requests, bypassing the application server.
Email works using user agents (e.g., Outlook, Thunderbird), which connect to mail servers. User agents communicate with mail servers using POP (authentication and download), IMAP (antuentication/download/on-server message manipulation), and HTTP (for webmail services like GMail), and mail servers communicate with each other using SMTP (simple mail transfer protocol), on top of TCP on port 25. SMTP is a plain text protocol much like HTTP, with 7-bit ASCII messages, CRLF EOL characters, and status codes/phrases in responses.
Just like with the web, you can do telnet some_host 25
and then type in SMTP requests, and you'll get an SMTP response back.
Overview of assignment 1.
The domain name system maps names like "google.com" to public IP addresses, which we can actually connect to. The public IP is translated to a private IP by a NAT, if present.
The DNS system is a hierarchical distributed database, implemented by DNS software running on name servers. DNS is an application layer protocol, allowing host to IP translation, canonical/alias hostnames, mail server aliasing, and load distribution (by mapping hosts to multiple possible IP addresses).
When a client wants the IP address for www.amazon.com, the client queries the root DNS server to find the IP of the .com DNS server, which it then queries for the IP of the amazon.com DNS server. The client then queries that server for the www.amazon.com DNS server for the IP of www.amazon.com, which it can then connect to. This is an iterative DNS query - there's also a recursive DNS query, where the client only queries one nameserver, which then in turn queries the child DNS servers on the client's behalf. This might be faster since the nameservers can be closer together, but results in very heavy load on the root DNS servers.
The IPs of th root DNS servers are usually hardcoded into most browsers. The root DNS servers are currently implemented by 13 logical servers, each highly replicated. Most of the time, we can skip common DNS queries, since most clients and DNS servers will have a DNS cache. The DNS cache stores DNS query results for a configurable amount of time, known as the DNS entry's TTL, to allow DNS changes to propagate in a timely fashion.
DNS is often attacked via DDoS on the root name servers, man-in-the-middle attacks, DNS poisoning (malicious upstream servers return malicious replies to downstream nameservers, which then cache the bad results for a long time). DNS is also used for traffic amplification for DDoS - DNS replies are larger than DNS queries, so attackers spoof the source IP of DNS requests as the attacker's target to get DNS servers to send more traffic to the target than the attackers could do by directly sending that much traffic to the target.
To distribute files to N users, a client-server architecture will take linear time with respect to N, since the central server has a limited upload/download speed, so it can only serve a fixed number of users at a time. In contrast, a P2P architecture takes logarithmic time, since each peer can propagate the file to k other peers that don't have a copy - the propagation rate increases exponentially, so the overall time taken is logarithmic, ignoring the time needed to compute .
BitTorrent is one P2P file distribution architecture. Peers in a BT network all have different chunks (subsets) of the file, and priodically asks peers for a list of chunks they have, and requesting chunks from peers in order from most-rare to least-rare. A peer will respond to requests from other peers in the order of how much the other peer has answered us - for example, a peer might only answer requests from the four peers that have answered the most reuqests from that peer, re-evaluating the top four every 10 seconds, and answering a random non-top-four peer every 30 seconds.
Video traffic is a significant fraction of traffic on the internet, over 50%. Video compression, video protocols, and CDNs are crucial for making sure this traffic can be handled. One important one is DASH = dynamic adaptive streaming over HTTP. The server divides the video into chunks, each of which are stored/encoded at several different bit rates, as different files. The server then makes a manifest of all the URLs of these chunks and sends it to the client. The client will regularly measure available server-client bandwidth, and chooses a chunk from the manifest that has a bitrate the connection can support, then downloads it. The client can switch at any time, whenever the video bandwidth is too high or low or when the remaining time in the current chunk is running low. The client can also choose which server to download from, and will usually measure multiple servers at a time to find the one with lowest latency/highest bandwidth.
CDNs can push content closer to access networks, like Akamai, or host content on huge, fast server clusters, like Limelight. They store many copies of the file on many computers to ensure there's always one that has a good connection to a given user.
UDP sockets allow us to unreliably send groups of bytes (datagrams) to other computers, with no connection required.
Overview of client/server code for making a request to a server using Python's socket
stdlib. Port scanning demo, and how to detect software versions by looking for version-specific variations in different protocols.
There are several common packet routing methods for a given network:
CDN/DNS providers like Cloudflare use these routing techniques to provide DDoS protection, such as using anycast to prevent any single server from getting overloaded. Whereas load balancers might be overpowered by a strong DDoS attack, hardware routers can handle a lot more.
DNS uses UDP because DNS requests and responses are usually short and fixed-length, and fit into a single UDP packet. In this application, TCP's three way handshake and ordering guarantees cause unnecessary overhead, so DNS implements its own retry and reliability mechanisms.
A blockchain is a distributed append-only database (there's inherently no single trusted node, and we can only append new entries to it, not change previous ones). Many modern blockchains also implement a cryptocurrency on top as an incentive system on top of that to encourage people to host the database and host it honestly.
Each node in the blockchain's network is either corrupt or honest, but we usually assume there must be at least some fraction of the nodes are honest (e.g., 50% + 1 nodes in Bitcoin). The honest fraction is used to ensure that the network as a whole can be trusted.
Quick overview of public key crypto and hashing.
In a Bitcoin network, a coin is a unique ID. Users are a public-private key pair (a verification-signing key pair). A transaction from user A to user B consists of a proof that user A holds a particular coin, the transaction sender/receiver, their verification keys, and the transaction amount. The transaction is then signed by the sender's secret signing key.
Every node generates and broadcasts a transaction to a lot of peers. Nodes collect all transactions they've received in the past. Every so often (around 10 minutes on average), nodes manage to generate blocks: a structure containing a collection of transactions, a hash of the previous block, and a proof of work (proof that it took a lot of work to generate this block). The rate of generation is limited by the proof of work.
Assignment 1 due monday.
The proof of work is based on changing fields in the block reserved for proof of work, until the entire block's hash starts with a given number of zeroes (this number is automatically adjusted by the network such that it will always take around 10 minutes to solve). Since a hash function is preimage resistant, we have to try a lot of hashes in order to find such fields. The resulting value of those fields is then a proof that we tried a lot of hashes to find one that starts with so many zeroes.
Since each block points to the previous one, we have a continuous chain of blocks, verified by hashes. The interesting property is that past blocks can't be modified, since that would change the hash, and all of the blocks made after it would be invalid - we can't change past blocks without recomputing all of the blocks that point to it. This prevents past transactions from being modified.
Nodes always choose the longest chain - the chain with the most work behind it. This ensures that anyone who wants to fake the chain must be able to put in more work than 50% of the network, in order to consistently have nodes choose its chain over the honest one.
Transport protocols try to establish a logical communication layer for apps on different hosts. They break up messages into segments, pass them to the network layer, reassemble them on the other side, and pass them to the app layer. This is analogously, if 12 kids in A's house are sending letters to 12 kids in B's house, the postal service is the network layer, and A/B muxing/demuxing letters is the transport layer.
TCP is responsible for interfacing with the network layer. One thing it needs to do is multiplex sockets with the link. When a host receives an IP datagram, there's source/destination IP addresses, the transport-layer header (which contains source/destination port numbers), and the payload. TCP then uses the detination port number to route the datagram to the right socket. UDP does something similar with UDP sockets.
In other words, TCP sockets are identified by source/destination IP and source/destination port, and IP datagrams are routed by these fields to the appropriate socket. This means each user can have a different socket to a server, because their source IP is different. For non-persistent HTTP, we can even have a different TCP socket for each request (since we close the connection after the request ends).
UDP packets have a checksum, which is the one's complement sum of 16-bit chunks of the the UDP segment (source/destination port, payload). This is useful for detecting when corruption occurs, though there's no way to correct it from this.
We want to simulate a reliable data channel (in-order, error-free data channel) using unreliable data channels, where it's possible for there to be corruption and packet losses. TCP does this by specifying the sender/receiver using finite state machines and techniques like checksums, sequence numbers, ACK messages, and retransmissions.
A check sum is used to detect corruption. A sequence number is used to handle duplicated transmissions (each packet either has sequence number 0 or 1, alternating between packets).
In the stop and wait protocol, the sender sends data, then waits a reasonable amount of time for an ACK, retransmitting if no ACK received. If the ACK packet is merely delayed rather than lost, we'll have sent a duplicate, but it's ignored due to the packet sequence number. The receiver will only send an ACK if it receives the data with a correct checksum. Basically, we send one packet at a time, waiting until we get an ACK or timeout - this has very poor performance.
;wip: missed due to interviews
;wip: missed due to interviews
;wip: missed due to interviews
;wip: missed due to interviews
;wip: missed due to interviews
;wip: missed due to interviews
Midterm Friday, 1.5 hours, covers everything up and including chapter 4 slide 49 in the slides posted online. Midterm review today for the first four chapters.
;wip
;wip
The link layer is only responsible for transferring datagrams between nodes that are physically adjacent. For example, Ethernet, frame relay, or 802.11. This is usually implemented in hardware (e.g., NICs, wireless chipsets), usually in network adapters and network cards, in order to ensure the functionality is fast enough, and because there might be hard realtime requirements for transmission. Additionally, some link layer services need to forward datagrams for other nodes in the network (such as for mesh networks), so there can actually be much more data being passed around at this layer than the layers above will see.
wThe link layer has to deal with network topologies, such as star, ring, and mesh.
The link layer deals with the upper layers by passing and receiving datagrams. Just like IP has IP addresses. the link layer uses MAC addresses to identify adjacent nodes in a network.
The link layer is responsible for flow control, error detection (by checksum), error correction (without retransmission), and setting full/half duplex modes.
The link layer performs error detection and correction by including EDC (error correction and detection) bits in datagrams. It may give false negatives (not catching errors) but will never give false positives (catching nonexistant errors). We balance the number of EDC bits with how much space those bits will take up.
The most common way to do error detection is the parity bit - for every d data bits, include a parity bit that is 1 if the number of 1's in the data bits was odd, and 0 otherwise - the parity bit always makes it so there are an even number of 1's. This allows us to detect, but not correct any errors. This can detect single bit errors, but not necessarily multiple bit errors.
The most common way to do both error detection and correction is the twi dimensional bit party - we write the bits of the data into a d by d matrix, then compute parity bits for each row and each column. Now, when we have a single bit error, one row and one column will fail the parity check - we can tell exactly which bit was flipped, and just flip it back. This allows us to detect single or double bit errors, and correct single bit errors, but no more than that.
Another common way to detect errors in higher levels (such as the transport layer) is the internet checksum, which is just the one's complement sum of every 16 bit chunk of the data. This is fast, has a low false negative rate in practice, and doesn't give false positives. We use this in the transport layers as well because upper layers such as IP can't rely on lower layers using good error correction - some link layer protocols don't have any error correction at all. Also, transport layer checksums can be implemented in software and can be more advanced/effective.
Another common checksum is CRC (cyclic redundancy check). For each d data bits D, we choose r bits R such that D \| R is a multuple of G, our CRC constant. ;wip: properties of CRC
Many physical links have many computers on the same link - a shared broadcast channel. The multiple access problem is the question of how to ensure multiple computers don't collide with each other (multiple trransmissions at the same time. We solve this with multiple access protocols.
The ideal multiple access protocol is to, given a shared broadcast channel with bandwidth R, M nodes that want to transmit can each transmit at R/M, and is fully decentralized (no special nodes, no global clock, no synchronization). While there are theoretical protocols that get pretty close to these properties, they tend to have too much overhead or are too hard to use in practice. In practice, there are three main approaches we use - channel partitioning, random access, and taking turns.
Channel partitioning protocols just splits up the channel for multiple transmitters. TDMA (time division multiple access) divides access to the broadcast channel into equal-sized timeslots, distributes them in a round-robin fashion between the nodes, and unused timeslots are left empty. This can be done in a decentralized fashion using token passing, but leaves a lot of bandwidth unused. FDMA (frequency division multiple access) is the same thing, but instead we divide frequencies on the broadcast channel and distributes them in a round robin fashion (with the same benefits and issues as TDMA).
Random access protocols allow any node to try and transmit at any time at full capacity, and then recover if we detect any collisions (two or more transmitting nodes). The differences between random access protocols are simply how we detect or recover from collisions. Slotted ALOHA is one such protocol - we break up the channel into equal sized timeslots, and nodes try to transmit whenever they want, and if there's a collision then each node retransmits in each subsequent timeslot with a certain probability until they've managed to broadcast without collisions. If we work out the probabilities as the number of nodes increases to infinity, it's about 37% efficient.
Take-turns protocols use mechanisms that make their nodes take turns. For example, polling has a master node that gives each node permission to broadcast, whenever the channel is free. This works, but has a lot of latency for the actual communication between nodes and the master nodes, and the master is a single point of failure. Another is the token-passing protocol, where the node having a token can transmit, and it passes the token to another node after completing.
In practice, we generally use CSMA/CD (carrier sense multiple access, collision detection), which is a random access protocol - listen to the channel first, and if someone is broadcasting, wait for them to finish. There can still be a collision due to propagation delay, where we don't detect a broadcast but still collide anyways, and the CD part handles that - when we detect a collision, we abort our transmission, send a jam signal (to help others detect collisions), and retry with exponential backoff (on the mth collision for this datagram, choose a random power of 2 between 2^0 and 2^m). CSMA/CD's efficiency approaches 100% as propagation delay approaches 0 and the max-size frame approaches infinity, according to the formula \frac{1}{1 + 5 t_{\text{propagation-delay}}/t_{\text{transmit-max-size-frame}}}. This is used in many protocols, such as Ethernet and 802.11.
One place we use multiple access protocols is for cable ISP networks - the ISP's downstream link sends data to subscribers, but subscribers must share an upstream link using a multiple access protocol.
;wip: everything up to chapter 6.6
Overview of midterm answers, including half an hour for midterm Q&A. Last day for re-marking today.
;wip: for networked applications, we usually don't specify the port of the client directly because we might want to run multiple instances of the same application on the same machine
The subnet mask tells us how many bits can be used to denote a host. For example, if we have 82.48.112.0/24, the first 24 of the bits are being used for the subnet, and the remaining 8 are used to identify the host. Therefore, on this subnet there are 256 possible hosts.
Suppose we have a laptop that we want to use to make a request to google.com when it's connected to the internet via an Ethernet cable to the router. The detailed process looks like this:
The rest of the course will focus on some of the applications of networks, such as wireless networking, multimedia, and network security.
;wip: missed due to projects
;wip: security - feistel ciphers (DES, AES)
Overview of symmetric and asymmetric crypto. Overview of RSA. See CO487 notes for all the gory details.
The group corresponding to mod n is \set{0, \ldots, n - 1}. We choose two large primes p, q that are about the same order of magnitude as N (the first bit in the 4096-bit numbers should be 1), and let n = pq and z = (p - 1)(q - 1). We choose e to be coprime with z (usually with guess-and-check, it shouldn't take many tries), and d such that ed = 1 \pmod z (usually by finding divisors of ez - 1 or similar).
We encrypt by doing c = m^e \pmod n and decrypt by doing m = c^d \pmod n. This works because m = (m^e \mod n)^d \mod n, since ed = 1 \pmod n and some consequences of Fermat's theorem: (m^e \mod n)^d \mod n = m^{ed} \mod n = m^{ed \mod z} \mod n = m^1 \mod n = m, as required. We can also sign by decrypting the plaintext and verify by encrypting the signature.
RSA is a lot slower than symmetric key algorithms like AES because it requires a lot of modular exponentiation, which is expensive compared to things like AES's S-boxes, which are essentially just lookup tables (hundreds of times slower when both are implemented in software, generally). Generally we use public key crypto to exchange a symmetric key securely, then use that as the session key.
Signatures alone don't solve authentication. To prevent playback/replay attacks, we also need to use a nonce. For Alice to authenticate with Bob, Bob sends Alice a nonce (which will only ever be used once), and then Alice sends Bob that nonce after signing it. On each attempt, the nonce will be different, which prevents previous authentication messages from being reused. Additionally, to prevent man-in-the-middle attacks, we need some out-of-band way to exchange public keys, usually something like a certificate authority or web of trust or similar.
On the transport layer, we have SSL/TLS over TCP. Over the network layer, we have IPsec, which implements encryption for IP datagram payloads. This is often used for things like VPNs, where traffic may be routed over the open internet.
Firewalls are another common network security measure - they allow or block incoming/outgoing packets based on a set of filtering rules for things like source IP, port, and transport protocol. This helps us prevent DOS attacks and access to privileged network services that should only be visible to internal users. Stateless firewalls operate at the network layer level (block/allow individual packets), and can do things like trying to TCP ACK with no open connection. Stateful firewalls are more powerful, operating at the trasport layer level (block/allow individual sessions), and can do things like timing out inactive TCP connections (usually, this is implemented as an FSM). Application gateways are yet more powerful, operating at the application layer (individual applications), and can do things like allowing one webserver but not another. Firewalls are useful but usually can't do authentication - people can still just spoof the IP address and any other fields they need to.
Another useful tool is an intrusion detection system, which performs deep packet inspection to look for known threats/malware/exploits and drop the packets or notify people appropriately. They can also detect things like port scanning, DOS attacks, and network mapping.
;wip
;wip