Exploring Text-Based Protocols

Joel Berger

Presentation URL

https://jberger.github.io/TextBasedProtocols

Source and Materials

https://github.com/jberger/TextBasedProtocols

We use really high-level abstractions

For Example

  • Chat clients are GUIs
  • Browsers hide the HTML, let alone the HTTP
  • Caching layers are provided by
    • web framework
    • cache abstraction library
    • cache engine language bindings

Many of these ...

  • Are human-readable(-ish)
  • Use simple transmission mechanisms (TCP)
  • Are worth understanding: it is useful to know how the abstractions work

Transmission Control Protocol (TCP)

  • Backbone of network communications
  • Ordered
  • Reliable
  • Error checked
  • Bi-directional

Streams! Not Framed!

  • Send bytes
  • Read bytes
    • Message may not be complete
  • Note: WebSocket IS framed

When is the message complete?

  • when the connection closes
    • expensive
  • end at a known boundary (usually a newline)
    • body containing boundary symbol?
  • after a pre-agreed number of bytes
  • combination of these
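
Since TCP delivers a stream rather than frames, a reader has to buffer bytes until a boundary shows up. The following is a minimal sketch of the "known boundary" approach, assuming a newline delimiter; `split_messages` and `LineReader` are hypothetical names used only for illustration.

```python
def split_messages(buffer: bytes, delimiter: bytes = b"\n"):
    """Split buffered bytes into complete messages plus a leftover partial one."""
    # Everything before the last delimiter is complete; the tail may be partial.
    *complete, remainder = buffer.split(delimiter)
    return complete, remainder


class LineReader:
    """Accumulates chunks read from a stream and returns only complete lines."""

    def __init__(self, delimiter: bytes = b"\n"):
        self.delimiter = delimiter
        self.pending = b""  # bytes received so far that do not yet form a message

    def feed(self, chunk: bytes):
        """Add a chunk (as returned by one read) and return any finished lines."""
        complete, self.pending = split_messages(self.pending + chunk, self.delimiter)
        return complete
```

Each call to `feed` corresponds to one read from the socket. A read that ends mid-message simply returns fewer (or no) lines, and the partial data waits in `pending` for the next read.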

A Note About New Lines

  • Some protocols require \n, others require \r\n
  • Some protocols/servers/clients are more tolerant of the wrong ending than others
  • To type a literal \r, find your terminal's lnext (literal-next) key with
    $ stty -a
    It is ^V for me, so ^V followed by ^M sends a literal \r.

TCP Clients

telnet

  • venerable client
  • nuisance
  • not always installed anymore
  • always sends \r\n line endings
$ telnet localhost 6379

netcat (nc)

  • simple
  • attaches stdin/stdout
  • useful for pipes
  • some versions allow -c or -C to send \r\n
$ echo "get foo" | nc -C localhost 11211 > output.txt 

socat

  • like netcat but more features
  • especially useful for ssl
  • add options to each address, separated by commas
    • crlf
    • verify=0
$ socat - OPENSSL:duckduckgo.com:443

My wife wants you to know that it is "tacos" backwards

Others

  • $ openssl s_client -connect www.google.com:443
  • Most languages have a TCP client available
  • Mojo::IOLoop::Client
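
As one illustration of a language-provided TCP client, here is a rough Python sketch that behaves a little like piping into netcat: send some bytes, signal that sending is finished, and read until the peer closes. The helper names `exchange` and `request` are hypothetical. The half-close is an assumption about how the server behaves and will not suit protocols that keep the connection open.

```python
import socket


def exchange(sock: socket.socket, payload: bytes) -> bytes:
    """Send payload on a connected socket, then read until the peer closes."""
    sock.sendall(payload)
    sock.shutdown(socket.SHUT_WR)  # half-close: tell the peer we are done sending
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # an empty read means the peer closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)


def request(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, run one exchange, and close the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, payload)
```

For example, `request("localhost", 11211, b"get foo\r\n")` would roughly mirror the netcat command shown earlier, assuming a server is listening on that port.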

Examples

  • Graphite
  • IRC
  • Memcached
  • HTTP
  • Redis (cut for time)

Playing Around

  • During TPRCiC: live on jberger.pl
  • After the Conference, use docker ...

Included Docker-Compose File

Purpose of Examples

  • Show different types of protocols
  • Get comfortable using tools and methods
  • Give examples, not details

Sending Data

Example: Graphite

Graphite

  • high performance metric collector
  • takes metrics via simple TCP "line protocol"
  • no response
  • visualizations are from another web service
  • uses \n line endings

Graphite Plaintext Format

metric.path.name value timestamp\n

Example:

wx.temp.chicago 78.2 1621913119\n

Sending Data

$ echo "wx.temp.chicago 78.2 $(date +%s)" | nc localhost 2003
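
The same line can be assembled in code. This is a small sketch of the plaintext format only; `graphite_line` is a hypothetical helper name, and the whitespace check is an assumption that follows from the space-separated format rather than a documented rule.

```python
import time


def graphite_line(path, value, timestamp=None):
    """Format one metric as 'metric.path.name value timestamp\\n' in bytes."""
    if timestamp is None:
        timestamp = int(time.time())  # same idea as `date +%s`
    if any(ch.isspace() for ch in path):
        # Fields are space separated, so whitespace in the path would break parsing.
        raise ValueError("metric path must not contain whitespace")
    return f"{path} {value} {int(timestamp)}\n".encode("ascii")
```

The resulting bytes can be written to a TCP connection on the port used above. No response comes back, so there is nothing to read.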

Sending and Receiving Data

Example: IRC

Internet Relay Chat

  • venerable chat protocol
  • mostly human readable
  • bi-directional line protocol
    • NOT request/response
    • server sends messages you need
  • technically \r\n, most servers accept \n

Message Format

:prefix command arg1 arg2 ... argN\r\n
  • prefix is optional and is not sent by clients
  • space-separated arguments (15 max)
  • max message length 512 bytes including ending
  • newlines in arguments are prohibited
  • spaces in arguments are prohibited, with one exception
    • a trailing argument with a leading : may contain spaces
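
The message format above is regular enough to split by hand. The following sketch shows one way to do it, assuming a single, complete, non-empty line; `parse_irc_message` is a hypothetical name, and the length and argument-count limits are not enforced.

```python
def parse_irc_message(line: str):
    """Split one IRC line into (prefix, command, args)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # Optional prefix: everything up to the first space, minus the colon.
        prefix, _, line = line[1:].partition(" ")
    # A trailing argument starts at the first " :" and may contain spaces.
    head, found, trailing = line.partition(" :")
    parts = head.split()
    if found:
        parts.append(trailing)
    command, args = parts[0], parts[1:]
    return prefix, command, args
```

Applied to a server line such as `:nick!user@host PRIVMSG #test :Hello World!`, this yields the prefix, the command `PRIVMSG`, and two arguments, the second of which keeps its spaces.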

Connect

IRC is noisy, so separate input from output

In one terminal do


$ touch irc-out; tail -f irc-out

In another terminal do

You'll see welcome messages in the other terminal

Alive Check: PING/PONG

If the server sends you

PING :something

You need to (rather promptly) reply

PONG :something
... or it will drop your connection
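
A scripted client would need to answer these automatically. Here is a minimal sketch of that rule, assuming one complete line at a time; `pong_for` is a hypothetical helper name.

```python
def pong_for(line: str):
    """Return the PONG reply for a PING line, or None for any other message."""
    stripped = line.rstrip("\r\n")
    if stripped.startswith(":"):
        # Skip an optional server prefix before the command.
        _, _, stripped = stripped.partition(" ")
    command, _, token = stripped.partition(" ")
    if command.upper() != "PING":
        return None
    # Echo the token back unchanged, including its leading colon if present.
    return f"PONG {token}\r\n"
```

A read loop could call this on every incoming line and write back whatever is not `None` before handling anything else.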

Join a Channel

Send

JOIN #test

Server Sends ("Replies")

:mynickname!myusername@$HOST JOIN #test
... to you and everyone in the channel

Send and Receive Messages

Send a message

PRIVMSG #test :Hello World!

Server Sends

:mynickname!myusername@$HOST PRIVMSG #test :Hello World!
... to everyone else in the channel

Request and Response

Example: Memcached

Memcached

  • in memory cache
  • simple commands for set, get, etc.
    • one response in reply for each request
  • line protocol with some length-prefixed content
  • picky about \r\n

Setting Keys

Step One: Set Up the Storage

set key flags expiration length\r\n
  • set is the command
  • key is name of the key
  • flags is an integer the server stores but does not interpret; use 0

  • expiration is one of
    • zero: never expires (by time)
    • seconds from now (up to 30 days)
    • unix timestamp (any larger value)
  • length in bytes (not including trailing CRLF)
set greeting 0 0 12\r\n

Setting Keys

Step Two: Send the Payload

payload bytes, correct length\r\n
  • payload
    • must be length from above
    • may contain newlines etc.
Hello World!\r\n
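
The two steps can be combined into one byte string before sending. This sketch only builds the request; `memcached_set` is a hypothetical name, and the key is assumed to be valid (no spaces or control characters).

```python
def memcached_set(key: str, payload: bytes, flags: int = 0, expiration: int = 0) -> bytes:
    """Build a complete 'set' request: command line, payload, trailing CRLF."""
    # The length counts payload bytes only, not the CRLF that follows them.
    header = f"set {key} {flags} {expiration} {len(payload)}\r\n".encode("ascii")
    return header + payload + b"\r\n"
```

Because the server reads exactly `length` bytes after the command line, the payload may itself contain `\r\n` without confusing the parser.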

Setting Keys

Server Response

  • If successful
    • STORED\r\n
  • Otherwise
    • ERROR\r\n
    • CLIENT_ERROR message\r\n
    • SERVER_ERROR message\r\n

Getting Keys

Request One or More Keys

One key

get key1\r\n

Multiple keys

get key1 key2 key3\r\n

Getting Keys

Server Response

For each key requested


VALUE key flags length\r\n
payload bytes, correct length\r\n

Then finally

END\r\n
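
Reading this response mixes both framing styles: lines for the VALUE headers and a byte count for each payload. Below is a rough sketch, assuming the whole response is already in memory and that it came from a plain `get`; `parse_get_response` is a hypothetical name.

```python
def parse_get_response(data: bytes) -> dict:
    """Parse a complete 'get' response into {key: (flags, payload)}."""
    values = {}
    pos = 0
    while True:
        # Header lines are delimited by CRLF.
        line_end = data.index(b"\r\n", pos)
        line = data[pos:line_end]
        pos = line_end + 2
        if line == b"END":
            return values
        tag, key, flags, length = line.split(b" ")
        if tag != b"VALUE":
            raise ValueError(f"unexpected line: {line!r}")
        # The payload is length-prefixed, so it may contain CRLF itself.
        size = int(length)
        payload = data[pos:pos + size]
        if data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("payload is not followed by CRLF")
        values[key.decode("ascii")] = (int(flags), payload)
        pos += size + 2
```

Keys that are missing on the server simply do not appear, so a response of just `END\r\n` parses to an empty dictionary.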

The Big One

Example: HTTP

HTTP

  • complex protocol
  • both parsing and behavior depend on many factors
  • technically \r\n, most servers accept \n
  • HTTP is not HTML; content can be anything or nothing

HTTP Message

  • Start line
  • Headers
  • Optional body

HTTP Start Line

Request

METHOD PATH HTTP/VERSION\r\n
Example:
GET / HTTP/1.1\r\n

Response

HTTP/VERSION STATUS MESSAGE\r\n
Example:
HTTP/1.1 200 OK\r\n

HTTP Headers

  • key-value pairs
    • keys and values are colon-separated
    • pair ends in \r\n
  • keys are case-insensitive
  • value may be very complex
  • headers end with an empty line (just \r\n)
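
To make the start line and header rules concrete, here is a simplified sketch that splits the head of a message from whatever follows it. `parse_head` is a hypothetical name. The sketch assumes strict `\r\n` endings and keeps only the last value of a repeated header, which a real parser would not do.

```python
def parse_head(data: bytes):
    """Split an HTTP message into (start_line, headers, remaining_bytes)."""
    # The empty line (CRLF CRLF) marks the end of the headers.
    head, _, rest = data.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Keys are case-insensitive, so normalize them to lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, rest
```

The remaining bytes may hold all, part, or none of the body, depending on how much has been read so far.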

Script to Pipe Body

Request Host Header

  • Optional in HTTP/1.0
  • Technically required in HTTP/1.1
  • Actually required for name-based virtual hosting
  • Combined with the start-line path to form absolute URLs
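
Putting the start line, headers, and Host header together, a request can be assembled as plain bytes. This is only a sketch; `build_request` is a hypothetical name, and it assumes a request without a body.

```python
def build_request(method: str, path: str, host: str, headers=None) -> bytes:
    """Build a body-less HTTP/1.1 request as bytes."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each line ends in CRLF, and one extra CRLF ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

Adding `Connection: close` to the headers asks the server to close the connection after responding, which makes the end of the response easy to detect when experimenting by hand.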

HTTP Message Body (?)

  • only on some directions/methods/status codes
  • complete when
    • connection closed
      • HTTP/1.0
      • Connection: close
    • after Content-Length bytes (no trailing \r\n)
    • via Transfer-Encoding: chunked
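
The rules above can be sketched as a small decision function plus a decoder for the chunked case. Both `body_framing` and `decode_chunked` are hypothetical names. The sketch assumes header names are already lower-cased and that the entire chunked body is in memory, and it ignores trailers and the cases where no body is allowed at all.

```python
def body_framing(headers: dict) -> str:
    """Decide how the end of the body will be signalled."""
    if "chunked" in headers.get("transfer-encoding", "").lower():
        return "chunked"
    if "content-length" in headers:
        return "content-length"
    return "until-close"  # read until the connection closes


def decode_chunked(data: bytes) -> bytes:
    """Decode a complete chunked body into the original bytes."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Chunk size is hexadecimal; drop any ";extension" text after it.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:  # a zero-length chunk marks the end of the body
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
```

Chunked encoding is itself a combination of the earlier techniques: a line-delimited size followed by a length-prefixed payload, repeated until a zero size appears.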

See Also

Conclusion

Go play with text-based protocols!