diff --git a/404.html b/404.html
new file mode 100644
index 0000000..3e6df16
--- /dev/null
+++ b/404.html
@@ -0,0 +1 @@
+404 Not Found
Me

404 Not Found

Video from sakugabooru
\ No newline at end of file
diff --git a/404.mp4 b/404.mp4
new file mode 100644
index 0000000..03b02e0
Binary files /dev/null and b/404.mp4 differ
diff --git a/50x.html b/50x.html
new file mode 100644
index 0000000..d24c3e5
--- /dev/null
+++ b/50x.html
@@ -0,0 +1,48 @@
+50x Borked
+
+

50[0234] Borked

+
+
+ +
+ Video from sakugabooru +
+
+
diff --git a/50x.mp4 b/50x.mp4
new file mode 100644
index 0000000..8e2b55f
Binary files /dev/null and b/50x.mp4 differ
diff --git a/contact/index.html b/contact/index.html
new file mode 100644
index 0000000..8265f51
--- /dev/null
+++ b/contact/index.html
@@ -0,0 +1 @@
+Contact
Contact

You can find me on GitHub as @AliiAhmadi and on Telegram as @Alii0Ahmadi. You can also send me an email.

\ No newline at end of file
diff --git a/favicon.svg b/favicon.svg
new file mode 100644
index 0000000..ab88930
--- /dev/null
+++ b/favicon.svg
@@ -0,0 +1,3 @@
+
+
+
diff --git a/index.html b/index.html
new file mode 100644
index 0000000..c3c2151
--- /dev/null
+++ b/index.html
@@ -0,0 +1 @@
+Ali Ahmadi
Ali Ahmadi

Hey there! I'm Ali. This is my personal blog. My areas of interest are algorithms, cosmological physics, cyber security, and software engineering. I like to share my knowledge with others. I hope you like my posts.

I spend my free time working on hobby projects and sometimes I write about my experiences.

\ No newline at end of file
diff --git a/posts/a-fresh-coat-of-paint/index.html b/posts/a-fresh-coat-of-paint/index.html
new file mode 100644
index 0000000..3726465
--- /dev/null
+++ b/posts/a-fresh-coat-of-paint/index.html
@@ -0,0 +1,22 @@
+A Fresh Coat of Paint
A Fresh Coat of Paint

I'm starting the new year with a new job. To paraphrase a friend, "it's just moving from one $BIGCORP to another", but it's still exciting. I worked my last gig for 5 years, so I'm nervous, but also very ready to do something new. While I'm doing one new thing I might as well do another. Taking some time off between jobs has given me enough breathing room to redo my website.

New Features

If you've been here before you'll probably have noticed a significant visual overhaul. The site is now in dark mode, has a more varied color palette, and is more responsive to differently sized viewports.

In addition to the visual changes on this site there are new features as well!

  1. Posts now have summaries thanks to Zola's summary feature, which allows you to use any content before a <!-- more --> comment in a page's Markdown.

  2. There are tags as well! These were actually always there, but I only put them in post front matter and didn't expose them on any pages. Tags are supported via Zola's taxonomies, which are much more complicated and powerful than simple tags demand.

  3. The metadata for site pages now includes Open Graph protocol and Twitter card support for a better display in social media.

What I Learned

Every time I update this website I learn something new. I continued to use vanilla HTML and CSS and eschew JavaScript, but was still blown away by how little I know in the webdev space.

In the interest of chronicling my newfound knowledge, here are a handful of the things I learned.

I've no doubt that I've committed some grave CSS sins with this revamp. Nevertheless, I enjoyed seeing what's possible and the important part is that the site works. 😅

Inspiration

I would be remiss not to mention the people who inspired these changes. The colorscheme uses a subset of Pavel Pertsev's gruvbox, which I've used as my syntax highlighting theme for years. Post metadata was modeled after Alexis King's site. I borrowed ideas for the Open Graph and Twitter card support from Andrew Kvalheim and Amos Wenger.

Ruud van Asseldonk deserves the lion's share of credit for these changes. I spent hours learning from their meticulously crafted CSS. This update wouldn't have been possible without everything I learned from them.

\ No newline at end of file
diff --git a/posts/announcing-layabout/index.html b/posts/announcing-layabout/index.html
new file mode 100644
index 0000000..e9b92b2
--- /dev/null
+++ b/posts/announcing-layabout/index.html
@@ -0,0 +1,74 @@
+Announcing Layabout
Announcing Layabout

Today I'm announcing Layabout, my first official Python library. Layabout is a small event handling library on top of the Slack Real Time Messaging (RTM) API. You can get it right now on PyPI.

What's It Good For?

You can think of Layabout as a micro framework for building Slack bots. Since it wraps Slack’s RTM API it does best with tasks like interacting with users, responding to channel messages, and monitoring events. If you want more ideas on what you can do with it keep reading or check out the examples.

Why?

Why choose Layabout when the Slack Events API exists and there's already an officially supported events library? If these points resonate with you then Layabout is for you.

Why Not?

Layabout won't be for everyone and that's OK. If these points resonate with you then you probably do want to use the official events library.

A Practical Example

If you want to download it and start playing with it as you read the rest of this blog post you can install it by running

pip install layabout
+

Once you've got Layabout installed let's take a look at what it's capable of by borrowing the code example right from its README.rst.

from pprint import pprint
+from layabout import Layabout
+
+app = Layabout()
+
+
+@app.handle('*')
+def debug(slack, event):
+    """ Pretty print every event seen by the app. """
+    pprint(event)
+
+
+@app.handle('message')
+def echo(slack, event):
+    """ Echo all messages seen by the app except our own. """
+    if event.get('subtype') != 'bot_message':
+        slack.rtm_send_message(event['channel'], event['text'])
+
+
+def someone_leaves(events):
+    """ Return False if a member leaves, otherwise True. """
+    return not any(e.get('type') == 'member_left_channel'
+                   for e in events)
+
+
+if __name__ == '__main__':
+    # Automatically load app token from $LAYABOUT_TOKEN and run!
+    app.run(until=someone_leaves)
+    print("Looks like someone left a channel!")
+

In 28 lines of code we've used Layabout to do the following:

Now that we've looked at what Layabout is, why you might want to use it, and how to use it let's look a bit deeper into its design and implementation.

Design

If you're familiar with the superb Flask library then Layabout probably looks eerily similar to you. That's no accident and hopefully Armin Ronacher thinks imitation is the sincerest form of flattery.

More concretely, I think Python decorators are a powerful combination of simplicity and flexibility. They also lend themselves particularly well to event-driven workflows.

The heart of Layabout is its aggressively simple Layabout.handle method. Its normal invocation just guarantees the decorated function will accept a SlackClient and an event as arguments before registering it as a particular type of handler.

Having access to those two arguments alone opens up a wealth of possibilities. To maximize developer freedom I wanted to provide as thin a wrapper as I could on top of the already excellent slackclient library. Giving direct access to a SlackClient instance meant I didn't have to write my own functions for calling out to the RTM API and I could also take advantage of its ability to call the Slack Web API as well.

I also took inspiration from pytest's pytest.mark.parametrize decorator to give handlers more versatility by adding an extra kwargs parameter.

from layabout import Layabout
+
+app = Layabout()
+
+if __name__ == '__main__':
+    name = input('What is your name? ')
+
+    @app.handle('hello', kwargs={'name': name})
+    def hello(slack, event, name):
+        print(f"Hello! My name is {name}.")
+
+    app.run()  # Run forever.
+

By adding a kwargs parameter we can not only use Layabout.handle as a decorator, but also to register functions at runtime with dynamic data. For example, this code logs events that happen, but only if they're in particular channels:

from layabout import Layabout
+
+app = Layabout()
+
+
+def log_for_channels(slack, event, channels):
+    """ Log the event if it happened in a channel we care about. """
+    if event['channel'] in channels:
+        print(f"{event['type']} happened in {event['channel']}!")
+
+
+if __name__ == '__main__':
+    # A mapping of events to their respective channels.
+    event_channels = (
+        ('star_added', ('G1A8FG8AE', 'C03QZSL29')),
+        ('star_removed', ('C47CSFJRK', 'C045BMR29', 'G13RTMGXY')),
+    )
+
+    # For each event register a new handler for specific channels.
+    for event, channels in event_channels:
+        app.handle(event, kwargs={'channels': channels})(log_for_channels)
+
+    app.run()  # Run forever.
+

You could also use a closure or default arguments on a normal function definition for this and it might look a little cleaner, but for passing runtime data to a lot of functions those can be tedious options.

Ultimately, I tried to write a library that I would want to use. I'm more excited now than ever to work with Slack's APIs, so in that regard I think this library is already a success.

Implementation

One of the hallmarks of this library is that it only supports Python 3.6+. I specifically chose to use only the most recent Python for three reasons:

  1. I wanted to take advantage of all the new language features like type annotations, f-strings, better destructuring assignment, etc.
  2. I didn't want to limit myself to the least common denominator by worrying about backwards compatibility.
  3. 2020 is fast approaching, folks. Use Python 3 already. If you intend to keep Python as part of your stack you're rapidly running out of excuses not to modernize.

I normally try to drink as little of the Object Oriented Kool-Aid as possible, so I tried a functional approach first, but keeping track of what was going on with connection state with a class just made sense to me. It also ended up being cleaner to keep a handler registry on an instance. Since they're self-contained you could conceivably spawn multiple instances into their own threads/processes and run them all simultaneously if you're careful with your global mutable state.

Async

Unfortunately I didn't see an easy way to use Python 3's async def because of the synchronous nature of slackclient's SlackClient.rtm_read method. This is a Python 3 feature I'd really like to learn more about, and event handling and async seem like a natural fit to me. If there's ever a reason to release a Layabout v2.0 I will probably push harder in this direction.

Type Annotations

From a development stance, the best part about this entire project so far has been learning how to use Python 3 type annotations. I miss them whenever I'm working with a project that doesn't have them.

I did have one minor annoyance while working with type annotations. Layabout keeps an internal collection of all the event handlers that have been registered to it with this signature.

# Private type alias for the complex type of the handlers defaultdict.
+_Handlers = DefaultDict[str, List[Tuple[Callable, dict]]]
+

I wanted to be even more restrictive and specify exactly what was required of the Callable by defining a type alias for a Handler. The restrictions I sought to specify were:

I took a stab at expressing this as

Handler = Callable[[SlackClient, Dict[str, Any], ...], Any]
+

Sadly, it would seem this doesn't work. Right now mypy complains with a

error: Unexpected '...'
+

I've been up and down the Python typing project, but even after visiting issue #193 and issue #264 I still can't find a simple syntax for expressing a function that has a minimum arity of two with required types, but is variadic thereafter and generic in the types it accepts.

There may, in fact, be a way to express this type with current annotations, but I haven't figured out what it is yet. It may also be the case that the difficulty in expressing this type is an indicator that a better API exists and should be preferred. For now I've settled on just declaring that a Handler is a Callable. I've got an auxiliary function that validates handlers to let users know if they've omitted a required positional argument.

Despite that small inconvenience type annotations are awesome! Go use them! I now firmly believe that supplemental static analysis makes for better software, even in dynamically typed languages.

Run Method

As a final note on implementation, the Layabout.run method only has an until parameter because it made it so much easier for me to unit test. If you go read the tests you'll notice many of them get called with

layabout.run(until=lambda e: False)
+

which saves me from the headache of trying to test an otherwise infinite loop. If until is None then Layabout.run just uses its own private function

def _forever(events: List[Dict[str, Any]]) -> bool:  # pragma: no cover
+    """ Run Layabout in an infinite loop. """
+    return True
+

Giving the looping conditional access to the events opened up enough possibilities that I decided to keep it as part of the design.

Thanks

I want to extend a special thank you to Alex LordThorsen, Geoff Shannon, Kyle Rader, and Mike Canoy for their help during the initial development of this library. In particular the feedback I got on PR #2 was incredible and radically changed the library for the better.

What's Next?

If you're still here and Layabout sounds like fun to you then check out these links to get started.

I happily entertain pull requests, so if something's not quite right feel free to jump in and submit your own fix if you're able. Happy Slacking!

\ No newline at end of file
diff --git a/posts/avatar-png/avatar.png b/posts/avatar-png/avatar.png
new file mode 100644
index 0000000..54cd36d
Binary files /dev/null and b/posts/avatar-png/avatar.png differ
diff --git a/posts/avatar-png/bad-ipv6.png b/posts/avatar-png/bad-ipv6.png
new file mode 100644
index 0000000..e0cef33
Binary files /dev/null and b/posts/avatar-png/bad-ipv6.png differ
diff --git a/posts/avatar-png/blank-canvas.png b/posts/avatar-png/blank-canvas.png
new file mode 100644
index 0000000..da2efeb
Binary files /dev/null and b/posts/avatar-png/blank-canvas.png differ
diff --git a/posts/avatar-png/font-for-ants.png b/posts/avatar-png/font-for-ants.png
new file mode 100644
index 0000000..63aacbc
Binary files /dev/null and b/posts/avatar-png/font-for-ants.png differ
diff --git a/posts/avatar-png/index.html b/posts/avatar-png/index.html
new file mode 100644
index 0000000..a91233d
--- /dev/null
+++ b/posts/avatar-png/index.html
@@ -0,0 +1,295 @@
+avatar.png
avatar.png

No, not that Avatar. And not the other one either. This post is about avatar.png, a handful of lines of PHP that have inspired me for a long time.

Around 2011 or 2012 a friend of mine, Andrew Kvalheim, blew my mind when he made his Skype profile picture display the IP address of the computer I was using. It might have looked a bit like this.

White text on
+a pinkish/purple background which says 'Hello, 140.160.254.56!'.

We were both working at our university's IT help desk and I think Skype for Business had just been rolled out to employees. If memory serves the application let you upload a profile picture or give a URL for one. The image was always fetched by the client, which made it possible to display a different image to each person viewing your profile picture.

For me this was practically magic. I was fairly early in my programming education at that point, so it took me a long time to really understand what was going on.

The web server was serving a file named avatar.png. When viewed in a browser you got a PNG containing your IP address, but when you opened the file itself in a text editor it was PHP! How could one file be two different things depending on how you looked at it?

I later learned that file extensions are mostly a suggestion and that with enough creativity you can make computers do a lot of fun things. That file can be two things at once; it just depends on your point of view.

Reflecting on the inspiration I've gotten from this simple program, I've spent a bit of time translating avatar.png from PHP to Rust. I learned way more than I bargained for in the process. Hopefully you'll learn something too as you read this.

PHP

Here is the original PHP which generated avatar.png. Judging by when this was written I think it probably would have run on PHP 5.3 or 5.4.

<?php
+
+    //Get IP address
+    $ip = explode('.', $_SERVER['REMOTE_ADDR'], 4);
+
+    //Render image
+    $image = @imagecreate(256, 256)
+        or die("Cannot Initialize new GD image stream");
+    $background_color = imagecolorallocate($image, 119, 41, 83);
+    $text_color = imagecolorallocate($image, 255, 255, 255);
+    imagettftext($image, 24, 0, 8, 96, $text_color, 'UbuntuMono-Regular.ttf', "Hello, \n$ip[0].$ip[1].$ip[2].$ip[3]!");
+
+    //Send response
+    header('Content-Type: image/png');
+    imagepng($image);
+    imagedestroy($image);
+?>
+

In case you're not overly familiar with PHP, here's a quick rundown on what's happening there.

I want to note that while reading the PHP docs I discovered that a significant amount of this code overlaps with the first imagecreate example. I think this showcases the benefits of quickly copying what you need and adapting it to your purposes. As we become more experienced software engineers we often over-engineer the heck out of things (that's half of what this post is about). But there's real joy in just grabbing what you find and using it as-is, especially for low-stakes fun.

Rust

Ok. Now that we understand the PHP well enough to translate it, let's set some ground rules.

  1. I like it when blog posts build up solutions, showing mistakes and oddities on the way. If you want to skip all that, here's the finished product.

  2. I assume a basic level of Rust understanding. I don't expect you to have read The Book cover to cover, but I'll skip over many explanations.

  3. As I translate this, keep in mind that PHP is a language made for the web. I'm not competing for brevity and certainly not trying to play code golf. The Rust code WILL be longer.

  4. I'll cut some corners in the initial implementation for the sake of understanding, but try to tidy things up by the end.

Choosing A Framework

The original PHP was likely run in Apache using mod_php, which appears to be out of style these days. In Rust we don't necessarily run a separate server like Apache or Nginx. Instead the application and server are compiled into the same binary and we choose between frameworks. I've been enjoying Axum lately, so that's what I used, but I'm sure Actix or Rocket would have been fine too.

First, we create a new Rust project and add our dependencies.

$ cargo new avatar && cd avatar
+$ cargo add axum@0.7.3
+$ cargo add tokio@1.35.1 --features=rt-multi-thread,macros
+

Then, we add Axum's "Hello, World!" example to src/main.rs and build up from there.

use axum::{routing::get, Router};
+
+#[tokio::main]
+async fn main() {
+    let app = Router::new().route("/", get(|| async { "Hello, World!" }));
+
+    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
+    axum::serve(listener, app).await.unwrap();
+}
+

Getting the IP

Going off the PHP example, the first thing to do is replicate the behavior of $_SERVER['REMOTE_ADDR'] and get the IP address of the client connecting to the server. PHP automagically populates $_SERVER with this information, but Axum wants us to be clear about our needs, so this gets a bit more complicated right away.

use axum::{extract::ConnectInfo, routing::get, Router};
+use std::net::SocketAddr;
+
+#[tokio::main]
+async fn main() {
+    let app = Router::new().route(
+        "/",
+        get(|ConnectInfo(addr): ConnectInfo<SocketAddr>| async move {
+            format!("Hello,\n{}!", addr.ip())
+        }),
+    );
+
+    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
+    let make_service = app.into_make_service_with_connect_info::<SocketAddr>();
+    axum::serve(listener, make_service).await.unwrap();
+}
+

Axum also exposes connection information, but not quite as automagically. This information is given to a handler (the closure we give to get) via an extractor. If that all sounds very abstract, it's because it is.

Specifically, we use the ConnectInfo<T> extractor as an argument to our closure and destructure it to get a SocketAddr (the desired T). These types can't be inferred, so our handler arguments get a bit verbose. This extractor also requires we create our app using into_make_service_with_connect_info<C>, which is a long way of saying "let my app get connection info". That behavior is not enabled by default.

Astute readers will have noticed that we also added the move keyword to our async block. Without this our friendly compiler steps in to give a lecture on borrowing and ownership.

error[E0373]: async block may outlive the current function, but it borrows `addr`, which is owned by the current function
+  --> src/main.rs:8:58
+   |
+8  |           get(|ConnectInfo(addr): ConnectInfo<SocketAddr>| async {
+   |  __________________________________________________________^
+9  | |             format!("Hello,\n{}!", addr.ip())
+   | |                                   ---- `addr` is borrowed here
+10 | |         }),
+   | |_________^ may outlive borrowed value `addr`
+   |
+note: async block is returned here
+  --> src/main.rs:8:58
+   |
+8  |           get(|ConnectInfo(addr): ConnectInfo<SocketAddr>| async {
+   |  __________________________________________________________^
+9  | |             format!("Hello,\n{}!", addr.ip())
+10 | |         }),
+   | |_________^
+help: to force the async block to take ownership of `addr` (and any other referenced variables), use the `move` keyword
+   |
+8  |         get(|ConnectInfo(addr): ConnectInfo<SocketAddr>| async move {
+   |                                                                ++++
+
+For more information about this error, try `rustc --explain E0373`.
+

The closure we've written captures addr by reference because addr.ip() borrows self. However, because the return of that closure is the whole async block, itself a Future, that reference is immediately invalidated. Thankfully the compiler warns us and tells us what to do. So helpful! 😎 The move gives ownership of addr to the returned Future.
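The same ownership rule shows up without any async machinery. As a rough sketch by analogy (the `make_greeter` helper is hypothetical and not part of the avatar code), a function that returns a closure over one of its own arguments needs `move` for the same reason: without it the closure would borrow a value that is dropped when the function returns, and the compiler reports the closure flavor of E0373.

```rust
use std::net::IpAddr;

/// Returns a closure that greets the given address.
///
/// `addr` is owned by this function, so the returned closure has to take
/// ownership of it with `move`. Dropping the `move` keyword makes the closure
/// capture `addr` by reference, which cannot outlive this function call.
fn make_greeter(addr: IpAddr) -> impl Fn() -> String {
    move || format!("Hello,\n{}!", addr)
}
```

Calling `make_greeter` with `127.0.0.1` and then invoking the returned closure yields the same two-line greeting the handler produces.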

The other way to get around this is to make our handler a function instead of a closure.

async fn avatar(ConnectInfo(addr): ConnectInfo<SocketAddr>) -> String {
+    format!("Hello,\n{}!", addr.ip())
+}
+

This also makes our app declaration prettier, so let's go with that.

let app = Router::new().route("/avatar.png", get(avatar));
+

Notice that we also changed the route to /avatar.png to match how the PHP was served. We can verify this works as intended with curl.

$ curl http://localhost:3000/avatar.png
+Hello,
+127.0.0.1!
+

Creating a PNG

Unfortunately, the assignment wasn't to return the client's IP address in plaintext. For parity with the PHP we need to serve an image. Fortunately, the image crate exists.

$ cargo add image@0.24.7
+

Background

The image crate allows us to create a PNG in a fashion similar to the PHP. The analog of @imagecreate is to create an ImageBuffer. Instead of imagecolorallocate, the ImageBuffer struct has a convenient from_pixel method which allows us to specify a starting pixel that is then copied across our new canvas. We can start with a single Rgb pixel.

use image::{ImageBuffer, Rgb};
+
+const WIDTH: u32 = 256;
+const HEIGHT: u32 = WIDTH;
+const BACKGROUND_COLOR: Rgb<u8> = Rgb([177, 98, 134]);
+
+// ...
+
+let img = ImageBuffer::from_pixel(WIDTH, HEIGHT, BACKGROUND_COLOR);
+

File Format

The resulting image buffer is not yet an image though. It's pretty much still a multi-dimensional array of integers. To construct a PNG someone can actually see we need to jam those integers into the PNG file format. Sadly for us, the equivalent of PHP's imagepng is nowhere near as convenient.

If you use ImageBuffer's save method to write the buffer out as a file

img.save("avatar.png").unwrap();
+

you'll get a blank canvas like this.

Nothing
+but a pinkish/purple background color.

Sure enough, that's a PNG, but using save is disastrous to us for a few reasons.

Instead, ImageBuffer has a write_to method which

[w]rites the buffer to a writer in the specified format.

In this case a "writer" is some type, W, which implements the Write and Seek traits. Rust's standard library gives us such a W in the form of std::io::Cursor<T>. We can use a Vec<u8> for our cursor's buffer type, T.

let mut cursor = Cursor::new(vec![]);
+

As for the "specified format", save has some logic to infer output format from file extension, but with write_to we can just pass ImageOutputFormat::Png.

img.write_to(&mut cursor, ImageOutputFormat::Png).unwrap();
+

The Vec<u8> wrapped by our cursor now contains all the bytes for a proper (albeit blank) PNG. We can work with that Vec<u8> directly by consuming the cursor with into_inner.
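To see the writer mechanics in isolation, here is a small standard-library-only sketch. The `write_signature` function is a hypothetical stand-in for a real encoder: it only writes the eight-byte PNG signature, but it takes the same kind of `Write + Seek` writer, and the bytes come back out through `into_inner`.

```rust
use std::io::{Cursor, Seek, SeekFrom, Write};

/// The eight-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Hypothetical stand-in for an encoder: writes only the PNG signature,
/// then seeks back to the start and reports how many bytes were written.
fn write_signature<W: Write + Seek>(writer: &mut W) -> std::io::Result<u64> {
    writer.write_all(&PNG_SIGNATURE)?;
    let end = writer.stream_position()?;
    writer.seek(SeekFrom::Start(0))?;
    Ok(end)
}

/// Runs the stand-in encoder against an in-memory cursor and returns the bytes.
fn signature_bytes() -> Vec<u8> {
    let mut cursor = Cursor::new(Vec::new());
    write_signature(&mut cursor).expect("an in-memory write should not fail");
    // Consuming the cursor hands back the Vec<u8> it was wrapping.
    cursor.into_inner()
}
```

Seeking the cursor does not change what `into_inner` returns; the position only matters to later reads and writes.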

Serving the Image

At this point we need to tell Axum how to serve the image we've created. How do we turn a Vec<u8> into a response that a client will understand as an image?

Axum knows how to serve Vec<u8> out of the box, but if we change the handler's signature to return just that we'll have undesired behavior.

async fn avatar(ConnectInfo(addr): ConnectInfo<SocketAddr>) -> Vec<u8> {
+    // ..
+    cursor.into_inner()
+}
+

Check that with curl and you'll see a response like

$ curl --head http://localhost:3000/avatar.png
+HTTP/1.1 200 OK
+content-type: application/octet-stream
+content-length: 1726
+date: Thu, 28 Dec 2023 02:30:17 GMT
+

Note that the Content-Type header is application/octet-stream and not image/png. We need analogs for PHP's header and imagepng in order to tell the client the response is a PNG.

We could build an appropriate Response ourselves, but the magic of Axum's IntoResponse trait provides a clear, terse syntax for this that I find preferable.

async fn avatar(ConnectInfo(addr): ConnectInfo<SocketAddr>) -> impl IntoResponse {
+    // ...
+    ([(header::CONTENT_TYPE, "image/png")], cursor.into_inner())
+}
+

We return a tuple with an array mapping header names to values and the bytes for the body. Axum's blanket implementations for IntoResponse do all the work to figure out how to turn that into an HTTP response.
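For intuition about what that tuple turns into, here is a hand-rolled sketch of a minimal HTTP/1.1 response with the same status line and two of the headers from the curl output above. This is illustrative only and is not how Axum builds responses; the `png_response` helper is hypothetical, and a real server adds further headers such as `date`.

```rust
/// Builds a bare-bones HTTP/1.1 response carrying `body` as a PNG.
///
/// Only the status line, content type, and content length are emitted;
/// everything else a real server would add is left out of this sketch.
fn png_response(body: &[u8]) -> Vec<u8> {
    let mut response = format!(
        "HTTP/1.1 200 OK\r\ncontent-type: image/png\r\ncontent-length: {}\r\n\r\n",
        body.len()
    )
    .into_bytes();
    // The body follows the blank line that terminates the headers.
    response.extend_from_slice(body);
    response
}
```

The only difference between the earlier `application/octet-stream` response and this one is the value of the `content-type` header; the body bytes are identical.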

Putting it all together our current handler looks like this.

async fn avatar(ConnectInfo(addr): ConnectInfo<SocketAddr>) -> impl IntoResponse {
+    let _text = format!("Hello,\n{}", addr.ip());
+    let img = ImageBuffer::from_pixel(WIDTH, HEIGHT, BACKGROUND_COLOR);
+
+    let mut cursor = Cursor::new(vec![]);
+    img.write_to(&mut cursor, ImageOutputFormat::Png).unwrap();
+
+    ([(header::CONTENT_TYPE, "image/png")], cursor.into_inner())
+}
+

The _text is notably being ignored right now. We can get IP addresses, we can create PNGs, and we can serve them. Now what remains is to put the text in the image.

Adding Text

For the Rust analog of PHP's imagettftext we need a way to draw text on our image. The image crate doesn't provide any routines for manipulating text, but it does recommend the imageproc crate, which is maintained by the same organization.

$ cargo add imageproc@0.23.0
+

This crate provides a draw_text_mut function, which will draw text onto an existing image. From its signature we can gather it needs a whopping 7 arguments (PHP's imagettftext is 8, so maybe I shouldn't complain). Naturally, these aren't really documented, but we can learn a lot from Rust signatures alone.

That feels like a lot, but we already have most of what we need. Luckily our existing ImageBuffer satisfies the Canvas trait and we already know it's using Rgb pixels which satisfy the Pixel trait. The x and y coordinates were given in the original PHP and we already have our _text. We only need a Scale and a Font. To work with both we'll need the rusttype crate.

$ cargo add rusttype@0.9.3
+

Getting a Font

The font used in the original PHP was Ubuntu Mono, which is freely available for download. We just need to put the file alongside our Rust code.

In PHP-land with imagettftext we just specified the path to a TrueType font file (UbuntuMono-Regular.ttf) and went on our merry way. Our Rust libraries want us to create a Font, which requires us to load the contents of that font file into our application.

We could do this on every request, which I think is what the PHP does. Or, we could do one better and bake the font directly into our application with Rust's include_bytes! macro. I threw in the concat! and env! macros as well for completeness.

const FONT_DATA: &[u8] = include_bytes!(concat!(
+    env!("CARGO_MANIFEST_DIR"),
+    "/fonts/UbuntuMono-R.ttf"
+));
+
+// ...
+
+let font = Font::try_from_bytes(FONT_DATA).unwrap();
+

Unfortunately, while the FONT_DATA can be const the Font itself can't, but we can work with this for now.

Setting a Scale

The last piece of information we need to draw text is a font Scale. According to the docs the scale is defined in pixels. However, PHP's imagettftext specifies a size in points. The difference between pixels and points is a tricky business, but for our purposes we can take the Easy Mode ™ route by assuming that a point is defined at a 3:4 ratio to a pixel. Thus, from the original font size of 24 we arrive at a scale of 32.

const SCALE: Scale = Scale { x: 32.0, y: 32.0 };
+
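The 3:4 ratio falls out of two conventions: a typographic point is 1/72 of an inch, and this sketch assumes a density of 96 pixels per inch (the CSS reference value, not something the PHP or Rust libraries promise). The helper below is hypothetical and only spells out the arithmetic.

```rust
/// Assumed display density in pixels per inch (an assumption of this sketch).
const PIXELS_PER_INCH: f32 = 96.0;
/// A typographic point is 1/72 of an inch.
const POINTS_PER_INCH: f32 = 72.0;

/// Converts a font size in points to pixels under the assumed density.
fn points_to_pixels(points: f32) -> f32 {
    points * PIXELS_PER_INCH / POINTS_PER_INCH
}
```

With these constants, `points_to_pixels(24.0)` evaluates to `32.0`, which matches the scale chosen above.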

Putting It All Together

With Font, Scale, and all our other arguments in hand we can finally draw text on the image.

const X: i32 = 8;
+const Y: i32 = 96;
+const WIDTH: u32 = 256;
+const HEIGHT: u32 = WIDTH;
+const TEXT_COLOR: Rgb<u8> = Rgb([235, 219, 178]);
+const BACKGROUND_COLOR: Rgb<u8> = Rgb([177, 98, 134]);
+const SCALE: Scale = Scale { x: 32.0, y: 32.0 };
+const FONT_DATA: &[u8] = include_bytes!(concat!(
+    env!("CARGO_MANIFEST_DIR"),
+    "/fonts/UbuntuMono-R.ttf"
+));
+
+async fn avatar(ConnectInfo(addr): ConnectInfo<SocketAddr>) -> impl IntoResponse {
+    let text = format!("Hello,\n{}", addr.ip());
+    let font = Font::try_from_bytes(FONT_DATA).unwrap();
+    let mut img = ImageBuffer::from_pixel(WIDTH, HEIGHT, BACKGROUND_COLOR);
+    draw_text_mut(&mut img, TEXT_COLOR, X, Y, SCALE, &font, &text);
+
+    let mut cursor = Cursor::new(vec![]);
+    img.write_to(&mut cursor, ImageOutputFormat::Png).unwrap();
+
+    ([(header::CONTENT_TYPE, "image/png")], cursor.into_inner())
+}
+

That handler will get us an image that looks something like this.

White text
+on a pinkish/purple background which says 'Hello,□127.0.0.'. After the last . a
+1 is only partially visible.

Which... doesn't really look right, does it? What the heck is the □ and why is the IP address cut off instead of being on a new line?

Handling Newlines

We find the answer, as we often do, by reading more closely. draw_text_mut's docs clearly say

Note that this function does not support newlines, you must do this manually.

That also offers an explanation for what the □ is. By digging into the imageproc source we can see that it ultimately calls Font::glyph, which says

Note that code points without corresponding glyphs in this font map to the “.notdef” glyph, glyph 0.

Since the newline character \n is a control character it's not in the font itself and thus we get the .notdef glyph instead. This also explains why the rusttype crate (and by extension imageproc) doesn't support newlines.

So how can we "do this manually"? Neither imageproc nor rusttype offers any specific advice. The easiest approach seems to be to just split the text up ourselves and draw what goes on the next line with a new y offset.

    let ip = addr.ip();
+    draw_text_mut(&mut img, TEXT_COLOR, X, Y, SCALE, &font, "Hello,");
+    let y = Y + SCALE.y as i32;
+    draw_text_mut(&mut img, TEXT_COLOR, X, y, SCALE, &font, &format!("{ip}!"));
+

I chose to just add SCALE.y to the original Y, which is playing fast and loose with concepts like line height, but seems to work out well enough. At last, we can reproduce the original PHP with output that looks something like this.

White
+text on a pinkish/purple background which says 'Hello,' on one line and
+'127.0.0.1!' on the next.
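The same trick generalizes to any number of lines. Here's a minimal sketch, using only the standard library, that pairs each line of text with the y offset it would be drawn at. The name line_positions is my own invention, not something imageproc or rusttype provide, and it treats line_height as a fixed step, just like the SCALE.y shortcut does.

```rust
/// Pair each line of `text` with the y offset it should be drawn at.
/// `line_height` is a plain fixed step (an assumption; real line height
/// depends on the font's metrics).
fn line_positions(text: &str, start_y: i32, line_height: i32) -> Vec<(i32, &str)> {
    text.lines()
        .enumerate()
        .map(|(i, line)| (start_y + i as i32 * line_height, line))
        .collect()
}
```

Each (y, line) pair could then be handed to draw_text_mut in a loop instead of writing the calls out one at a time.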

Room for Improvement

I've not gone back and run an old Apache instance, so I can't say with 100% certainty that we've got a pixel perfect replica of the original, but I think it's darn close. There's still more to think about though. Here's a grab bag of potential improvements, some I've made and some I've saved for another day.

Using One Font

Earlier I mentioned that our Font couldn't be const. That's true, but with a little effort it can at least be static. I don't always love globals, but it feels silly to create a new Font on each request when it could be the same darn font every time.

I initially thought to use the lazy_static crate for this purpose, but since Rust 1.70.0 stabilized OnceLock I thought I'd give that a try. Why not use a standard library approach if you can?

Until LazyLock stabilizes it seems like the most ergonomic use of OnceLock for what we want involves creating an accessor function.

use std::sync::OnceLock;
+
+fn font() -> &'static Font<'static> {
+    static FONT: OnceLock<Font> = OnceLock::new();
+    FONT.get_or_init(|| Font::try_from_bytes(FONT_DATA).expect("Built-in font data was invalid"))
+}
+

At the call site we just swap a &font for a font() and we're done. One Font.

Normally I avoid potential panics as much as I can, but the only way that Font::try_from_bytes can fail is if the .ttf file is invalid at compile time, so I felt comfortable using expect.

Error Handling

Yeah, the PHP just uses or die, but this is Rust, so we should try harder.

A panic in a completely unrecoverable situation which indicates your software was built wrong seems acceptable for now, but we should fix all of our unwrap calls. While not strictly necessary, it helps to have good supporting libraries.

$ cargo add anyhow@1.0.78
+$ cargo add thiserror@1.0.53
+

Server Errors

The unwraps in main can be neatly ?'d out of existence with anyhow and Result in main. If they fail the program can't reasonably continue, but we should try to exit cleanly and allow an admin to read something about what happened.

#[tokio::main]
+async fn main() -> anyhow::Result<()> {
+    let app = Router::new().route("/avatar.png", get(avatar));
+
+    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
+    let make_service = app.into_make_service_with_connect_info::<SocketAddr>();
+    axum::serve(listener, make_service).await?;
+    Ok(())
+}
+

Handler Errors

The remaining unwrap in avatar benefits from a more careful approach. If converting one ImageBuffer into a Vec<u8> fails for a single request the whole application shouldn't crash. This failure point comes from

img.write_to(&mut cursor, ImageOutputFormat::Png).unwrap();
+

The write_to method returns a Result<T, E> whose E is an ImageError. Unfortunately, it's not quite as simple as returning Result<impl IntoResponse, image::ImageError> from our handler. Trying to do so yields a rare and disappointing mystery compilation error too long to reproduce here.

After sleuthing in Axum's error handling docs you can discover that the Err variant for a Result should implement IntoResponse itself. Frustratingly, our application code owns neither the ImageError type nor the IntoResponse trait, so Rust's orphan rules prevent us from implementing this directly ourselves. The easiest solution is to make a wrapper.

#[derive(Debug, thiserror::Error)]
+#[error("Failed to generate image: {0}")]
+struct AvatarError(#[from] image::ImageError);
+

The thiserror crate makes this blessedly straightforward. An AvatarError wraps an ImageError and can automatically be converted from one thanks to the ease of #[from].

impl IntoResponse for AvatarError {
+    fn into_response(self) -> axum::response::Response {
+        (StatusCode::INTERNAL_SERVER_ERROR, self.to_string()).into_response()
+    }
+}
+

Having to implement this trait for our own error makes a little more sense now. Without it how could Axum have known that this particular error should be an Internal Server Error?

Now we're allowed to define our handler as

async fn avatar(
+    ConnectInfo(addr): ConnectInfo<SocketAddr>,
+) -> Result<impl IntoResponse, AvatarError> {
+    // ...
+}
+

and we can use ? just like we hoped to.
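For a sense of what #[from] is doing for us, here's a hand-written, std-only sketch of roughly the same pattern. It's not the actual thiserror expansion. PortError and parse_port are made-up names, and ParseIntError stands in for image::ImageError as "an error type we don't own".

```rust
use std::{error::Error, fmt, num::ParseIntError};

/// A wrapper we own around an error type we don't.
#[derive(Debug)]
struct PortError(ParseIntError);

impl fmt::Display for PortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Failed to parse port: {}", self.0)
    }
}

impl Error for PortError {
    // Expose the wrapped error as the underlying cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}

// This conversion is what lets `?` turn a ParseIntError into a PortError.
impl From<ParseIntError> for PortError {
    fn from(err: ParseIntError) -> Self {
        PortError(err)
    }
}

fn parse_port(s: &str) -> Result<u16, PortError> {
    Ok(s.trim().parse::<u16>()?)
}
```

Because we own PortError, we'd be free to implement a foreign trait like IntoResponse for it, which is exactly the escape hatch AvatarError gives us.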

IPv6 Support

Until now we've assumed that everything is using IPv4, but it's the tail end of 2023, so we should probably at least think about IPv6. Unfortunately things get ugly in a hurry.

In fairness though, I'm not sure how well the original PHP's

<?php
+    $ip = explode('.', $_SERVER['REMOTE_ADDR'], 4);
+?>
+

handled IPv6, so this is well off the beaten path.

Allowing IPv6 Connections

With what's been written so far if you try to make an IPv6 connection you're gonna have a bad time.

$ curl -6 http://localhost:3000/avatar.png  
+curl: (7) Failed to connect to localhost port 3000: Connection refused
+

This is because we told the server to listen on 0.0.0.0 as a special shorthand for "listen on all network interfaces". 0.0.0.0 is specific to IPv4, so we aren't listening for any incoming IPv6 connections. We can fix that by listening on ::, the IPv6 equivalent.

let listener = tokio::net::TcpListener::bind("[::]:3000").await?;
+

I've wrapped the :: with [] because it helps readability by disambiguating between : as part of the address and : as an address/port separator. Also because RFC 2732 said so.
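If you'd rather not think about brackets at all, the standard library can build the address for you. This is a small sketch with a made-up helper name, listen_addr. As far as I know tokio's bind accepts a SocketAddr as well as a string, but treat that as an assumption and check the docs for your version.

```rust
use std::net::{Ipv6Addr, SocketAddr};

/// Build the IPv6 "listen on everything" address for a given port.
fn listen_addr(port: u16) -> SocketAddr {
    // Ipv6Addr::UNSPECIFIED is `::`, the IPv6 counterpart of 0.0.0.0.
    SocketAddr::from((Ipv6Addr::UNSPECIFIED, port))
}
```

Printing listen_addr(3000) gives the bracketed form back, and the unbracketed ":::3000" doesn't parse as a SocketAddr at all, so the brackets aren't only there for the humans.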

Listening on an IPv6 address fixes the problem and also still works for IPv4... kinda... Apparently the exact semantics are OS-dependent. On Windows it only binds for IPv6 and on Linux it's dependent on kernel configuration.

I think in order to make it work in both Windows and Linux you'd have to

I couldn't be bothered. I'd choose to run the application on Linux and I control the kernel, so it's working for me. 🤷‍♂️

Displaying IPv4 Correctly

Just because we can accept IPv4 and IPv6 connections doesn't mean it's actually working the way we want it to though. Letting :: bind both IPv4 and IPv6 has another side effect. IPv4 addresses are interpreted as IPv4-mapped IPv6 addresses.

If you curl via IPv4 specifically with

$ curl -4 -O http://localhost:3000/avatar.png  
+

you're likely to see something like this.

White
+text on a pinkish/purple background which says 'Hello,' on one line and
+'::ffff:127.0.0.' on the next. After the last . a 1 is only partially
+visible.

In this form an IPv4 address like 127.0.0.1 is represented in an IPv6 address format like ::ffff:127.0.0.1. The IPv4 form is the "canonical" form. Luckily Rust's IpAddr type has a method for this, to_canonical.

let ip = addr.ip().to_canonical();
+

This was actually just added in Rust 1.75.0 mere days ago, well after I started writing this. You can just take my word that this was rather more of a pain in the ass to deal with before that.
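If you're stuck on an older toolchain, one way to get the same effect by hand might look like the sketch below. The function name canonical is mine, and it assumes Ipv6Addr::to_ipv4_mapped is available on your Rust version. I'm not claiming this matches to_canonical in every edge case, only for the IPv4-mapped addresses we care about here.

```rust
use std::net::IpAddr;

/// Unwrap IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) back into plain IPv4.
/// Every other address is returned unchanged.
fn canonical(ip: IpAddr) -> IpAddr {
    match ip {
        IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
            Some(v4) => IpAddr::V4(v4),
            None => IpAddr::V6(v6),
        },
        v4 => v4,
    }
}
```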

IPv4 works again!

White
+text on a pinkish/purple background which says 'Hello,' on one line and
+'127.0.0.1!' on the next.

IPv6 also works!

White
+text on a pinkish/purple background which says 'Hello,' on one line and '::1!'
+on the next.

Haha, just kidding, it absolutely does not. Most IPv6 addresses are way longer than ::1.

White text
+on a pinkish/purple background which says 'Hello,' on one line and
+'2001:0db8:85a3:' on the next. After the last : a 0 is only partially
+visible.

Displaying IPv6 Correctly

Out of all the problems we've dealt with so far I think this is the thorniest. Unlike with the newline issue, the text simply doesn't fit in the image. At this font size, if you get a real-world IPv6 address you're more than likely going to draw it right off the edge of the canvas.

Just decrease the font size, you might say...

White
+text on a pinkish/purple background which says 'Hello,' on one line and
+'2001:0db8:85a3:0000:0000:8a2e:0370:7334!' on the next. The text all fits
+within the image, but is extremely small compared to similar images and not
+easily read.

Your eyes must be better than mine. That's not easy to read. I had to set the Scale to 12.0 to make that fit. It looks even sillier with ::1, I promise.

OK, well just make the PNG wider then!

White text
+on a pinkish/purple background which says 'Hello,' on one line and
+'2001:0db8:85a3:0000:0000:8a2e:0370:7334!' on the next. The text all fits
+within the image, but the image itself is extremely large compared to similar
+images.

Cool, now it's 648 pixels wide. If you're reading on mobile it might well exceed the width of your screen. It also looks silly with ::1, btw. Additionally, it likely violates an original constraint that the image be 256 x 256 pixels to serve as a profile picture.

Uhh... why are you making this so complicated? We did newlines before, wrap the text!

White
+text on a pinkish/purple background which says 'Hello,' on the first line,
+'2001:0db8:85a3:' on the second, '0000:0000:8a2e:' on the third line, and
+'0370:7334!' on the fourth line. The text is not centered in the image.

Huh. Actually, I don't hate that, but it no longer looks centered. Also, IPv6 addresses aren't guaranteed to have a uniform length. There are several different ways to represent them. For example, 2001:0db8:0000:0000:0000:8a2e:0370:7334 can also be written as 2001:db8::8a2e:370:7334. Furthermore, RFC 5952 recommends that

Leading zeros MUST be suppressed.

and

The use of the symbol "::" MUST be used to its maximum capability.

so if we play by the rules different addresses might take up a different number of lines. Really, both width and height could be variable depending on our solution.
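You can watch this happen with nothing but the standard library, since Ipv6Addr's Display output is documented as following RFC 5952. The helper name display_form is just for illustration.

```rust
use std::net::Ipv6Addr;

/// Parse an IPv6 address and render it the way Rust's `Display` impl does,
/// i.e. with leading zeros dropped and the longest zero run collapsed to `::`.
fn display_form(addr: &str) -> Option<String> {
    let ip: Ipv6Addr = addr.parse().ok()?;
    Some(ip.to_string())
}
```

Feeding it 2001:0db8:0000:0000:0000:8a2e:0370:7334 gives back 2001:db8::8a2e:370:7334, which is 16 characters shorter. That's the variable length we'd have to lay out.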

Is There Nothing We Can Do?

We've considered 3 solutions, all of which have their own issues.

  1. Shrinking the font size.
    • Mostly illegible.
  2. Increasing the image width.
    • Looks awkward for variable width addresses.
    • Likely violates an original constraint for 256 x 256 pixels.
  3. Wrapping the text.
    • Looks awkward for variable lines.

Of the 3 solutions I think text wrapping is the most appropriate, but I ran out of time to pursue an implementation before writing this. I feel comfortable leaving a solution as an exercise to the reader and would love to hear what others would do.
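As a starting point for that exercise, here's one possible sketch of the wrapping step on its own, with no drawing involved. It sidesteps the variable line count by always printing all eight groups with leading zeros, which deliberately ignores the RFC 5952 advice above. wrap_ipv6 and groups_per_line are names I made up for this example.

```rust
use std::net::Ipv6Addr;

/// Split an IPv6 address into lines of at most `groups_per_line` groups.
/// Groups are always 4 hex digits, so every address yields the same layout.
fn wrap_ipv6(ip: Ipv6Addr, groups_per_line: usize) -> Vec<String> {
    let groups: Vec<String> = ip.segments().iter().map(|g| format!("{g:04x}")).collect();
    // Guard against 0, which would make `chunks` panic.
    let chunks: Vec<&[String]> = groups.chunks(groups_per_line.max(1)).collect();
    let last = chunks.len() - 1;
    chunks
        .iter()
        .enumerate()
        .map(|(i, chunk)| {
            let mut line = chunk.join(":");
            // Keep the trailing separator on every line except the final one.
            if i < last {
                line.push(':');
            }
            line
        })
        .collect()
}
```

With three groups per line every address comes out as exactly three lines, so the vertical position could be fixed once. Centering and the trailing ! are still left to the reader.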

SVG

I'm sure that by now at least one person has been screaming "Use SVG!" in their mind. Personally, I'm not convinced it's necessarily a better fit, but it's not something I know much about, so I'm open to being wrong.

I applied what little I know to create an SVG.

<svg width="256" height="256" xmlns="http://www.w3.org/2000/svg">
+  <style>
+  @font-face {
+    font-family: "Ubuntu Mono";
+    src: url(/fonts/UbuntuMono-Regular.ttf) format('truetype');
+  }
+  </style>
+  <rect width="100%" height="100%" fill="#b16286"/>
+  <text x="8" y="96" font-size="24" font-family="Ubuntu Mono" fill="#ebdbb2">
+    <tspan x="0" dy="1em">Hello,</tspan>
+    <tspan x="0" dy="1em">127.0.0.1!</tspan>
+  </text>
+</svg>
+

I left the font out for this particular image, but it yields something like this.

Hello, 127.0.0.1!

Here are my observations given what I assume about SVGs, PNGs, and this problem.

Benefits

Downsides

I don't see the benefits hugely outweighing the downsides, especially when keeping in mind what likely would have been the original constraints. What we've got should do for now.

The Finished Product

I can't believe you read this far. The final code is available on GitHub, but this post didn't feel complete without putting everything I talked about in one place for comparison with the PHP.

It's significantly longer than the original and probably more complicated than it needs to be, but it mostly works and that's worth something.

use std::{io::Cursor, net::SocketAddr, sync::OnceLock};
+
+use axum::{
+    extract::ConnectInfo,
+    http::{header, StatusCode},
+    response::IntoResponse,
+    routing::get,
+    Router,
+};
+use image::{ImageBuffer, ImageOutputFormat, Rgb};
+use imageproc::drawing::draw_text_mut;
+use rusttype::{Font, Scale};
+
+const X: i32 = 8;
+const Y: i32 = 96;
+const WIDTH: u32 = 256;
+const HEIGHT: u32 = WIDTH;
+const TEXT_COLOR: Rgb<u8> = Rgb([235, 219, 178]);
+const BACKGROUND_COLOR: Rgb<u8> = Rgb([177, 98, 134]);
+const SCALE: Scale = Scale { x: 32.0, y: 32.0 };
+const FONT_DATA: &[u8] = include_bytes!(concat!(
+    env!("CARGO_MANIFEST_DIR"),
+    "/fonts/UbuntuMono-R.ttf"
+));
+
+#[derive(Debug, thiserror::Error)]
+#[error("Failed to generate image: {0}")]
+struct AvatarError(#[from] image::ImageError);
+
+impl IntoResponse for AvatarError {
+    fn into_response(self) -> axum::response::Response {
+        (StatusCode::INTERNAL_SERVER_ERROR, self.to_string()).into_response()
+    }
+}
+
+fn font() -> &'static Font<'static> {
+    static FONT: OnceLock<Font> = OnceLock::new();
+    FONT.get_or_init(|| Font::try_from_bytes(FONT_DATA).expect("Built-in font data was invalid"))
+}
+
+async fn avatar(
+    ConnectInfo(addr): ConnectInfo<SocketAddr>,
+) -> Result<impl IntoResponse, AvatarError> {
+    // Wow, IPv6 causes a lot of headache. 😵‍💫
+    let ip = addr.ip().to_canonical();
+    let mut img = ImageBuffer::from_pixel(WIDTH, HEIGHT, BACKGROUND_COLOR);
+
+    draw_text_mut(&mut img, TEXT_COLOR, X, Y, SCALE, font(), "Hello,");
+    let y = Y + SCALE.y as i32;
+    draw_text_mut(&mut img, TEXT_COLOR, X, y, SCALE, font(), &format!("{ip}!"));
+
+    let mut cursor = Cursor::new(vec![]);
+    img.write_to(&mut cursor, ImageOutputFormat::Png)?;
+
+    Ok(([(header::CONTENT_TYPE, "image/png")], cursor.into_inner()))
+}
+
+#[tokio::main]
+async fn main() -> anyhow::Result<()> {
+    let app = Router::new().route("/avatar.png", get(avatar));
+
+    let listener = tokio::net::TcpListener::bind("[::]:3000").await?;
+    let make_service = app.into_make_service_with_connect_info::<SocketAddr>();
+    axum::serve(listener, make_service).await?;
+    Ok(())
+}
+

Thanks for reading! Maybe I'll learn to write smaller posts next year. 🤣

\ No newline at end of file diff --git a/posts/avatar-png/ipv4-mapped.png b/posts/avatar-png/ipv4-mapped.png new file mode 100644 index 0000000..1a884d9 Binary files /dev/null and b/posts/avatar-png/ipv4-mapped.png differ diff --git a/posts/avatar-png/localhost-ipv4.png b/posts/avatar-png/localhost-ipv4.png new file mode 100644 index 0000000..f89707d Binary files /dev/null and b/posts/avatar-png/localhost-ipv4.png differ diff --git a/posts/avatar-png/localhost-ipv6.png b/posts/avatar-png/localhost-ipv6.png new file mode 100644 index 0000000..79d6c43 Binary files /dev/null and b/posts/avatar-png/localhost-ipv6.png differ diff --git a/posts/avatar-png/no-newline.png b/posts/avatar-png/no-newline.png new file mode 100644 index 0000000..1a94c92 Binary files /dev/null and b/posts/avatar-png/no-newline.png differ diff --git a/posts/avatar-png/wide-boi.png b/posts/avatar-png/wide-boi.png new file mode 100644 index 0000000..436c6fc Binary files /dev/null and b/posts/avatar-png/wide-boi.png differ diff --git a/posts/avatar-png/wrapped-text.png b/posts/avatar-png/wrapped-text.png new file mode 100644 index 0000000..cff0229 Binary files /dev/null and b/posts/avatar-png/wrapped-text.png differ diff --git a/posts/deprecating-layabout/index.html b/posts/deprecating-layabout/index.html new file mode 100644 index 0000000..34f43db --- /dev/null +++ b/posts/deprecating-layabout/index.html @@ -0,0 +1,37 @@ +Deprecating Layabout
Me

Deprecating Layabout

Since Layabout launched last year it has been downloaded 5,755 times, gotten 16 stars on GitHub, been used by a Portuguese startup to teach a Haskell workshop, and received a Twitter shout-out from @roach, one of the core contributors to the official Python Slack client. During that time the official client library also got a lot better! So much better, in fact, that I've decided to deprecate Layabout.

When?

Layabout is officially deprecated on January 1st, 2020 along with Python 2 (finally 😉). I'll be rolling out documentation changes and adding deprecation warnings in a final release within the next week or so. Technically Layabout will continue to function as long as the API interface supported by the 1.0 Slack library is supported, but you should transition off of it as soon as you are able. I recommend using the 2.0 Slack library instead.

Why?

In May, Rodney Urquhart and contributors rewrote the slackclient library from the ground up in a way that closely matched some of the design decisions I made when creating Layabout. When I announced Layabout I included a quick example of using the framework to build an echo client that would repeat all messages in a given channel. To understand why Layabout no longer provides substantial benefit, let's compare that example written with Layabout to the same application using the 2.0 Slack client.

""" An echo client written using Layabout. """
+from layabout import Layabout
+
+app = Layabout()
+
+
+@app.handle("message")
+def echo(slack: "slackclient.SlackClient", event: dict) -> None:
+    """ Echo all messages seen by the app except our own. """
+    if event.get("subtype") != "bot_message":
+        slack.rtm_send_message(event["channel"], event["text"])
+
+
+if __name__ == "__main__":
+    app.run()
+
""" An echo client written using the official slackclient library. """
+import os
+from slack import RTMClient
+
+
+@RTMClient.run_on(event="message")
+async def echo(**payload) -> None:
+    """ Echo all messages seen by the app except our own. """
+    data = payload["data"]
+    web_client = payload["web_client"]
+
+    # Ordinary messages may lack "subtype" and "thread_ts", so use .get().
+    if data.get("subtype") != "bot_message":
+        web_client.chat_postMessage(
+            channel=data["channel"], text=data["text"], thread_ts=data.get("thread_ts")
+        )
+
+
+if __name__ == "__main__":
+    slack_token = os.environ["SLACK_API_TOKEN"]
+    rtm_client = RTMClient(token=slack_token)
+    rtm_client.start()
+

The Layabout example is a little more terse, but neither is a lot of code. They both leverage a decorator pattern for callback registration and support type annotations. The official library also adds async support. It should be relatively straightforward to translate a Layabout application to using the official library.

Event-oriented decorator API

At the time that I wrote Layabout, the slackclient library provided more of a low-level API, so Layabout functioned as a refinement on reading from the WebSocket connection in a loop. It seems Rodney and I both agreed that the decorator pattern is much more convenient for callback registry. The only important thing that the official library does differently is give you a payload dictionary rather than individual arguments. Both a slack instance and an event are included in that payload, so this is a minor change.

Async

The rewrite of the slackclient library brought with it clever, optional support for Python's async/await syntax. At the time I was writing Layabout I wanted to add support for that myself, but the only available API was synchronous. Interacting with the Slack API via WebSockets lends itself naturally to an asynchronous approach. In my opinion, using the official library will give a big advantage over Layabout here. If you've structured your Layabout callbacks well and haven't relied on operations that must be synchronous, then you should be fine here.

Type annotations

Like Layabout, the slackclient library now has type annotations, which I think are a fantastic addition to Python. If you choose to add a type checker like mypy or Pyre to your toolkit, you can get the benefit of optional static analysis. Layabout no longer provides an extra service since this is supported by the slackclient library directly.

What's Missing?

There are three main pieces of functionality that Layabout still provides that aren't available with the official slackclient library.

  1. The * event

    Layabout supported registering a function to trigger on all events with @app.handle("*"). This was primarily useful in debugging, but I think you can achieve similar results by setting the logging level to logging.DEBUG.

  2. Auto-loading environment variables

    Dealing with credentials is always a pain in the butt, so I designed Layabout to automatically load the Slack API token from $LAYABOUT_TOKEN by default. Or, with app.run(connector=EnvVar("WHATEVER")) you could load the token from whatever environment variable you pleased.

  3. The run until option

    With Layabout you could use app.run(until=...) to provide an arbitrary function which would trigger an exit from the application's core loop. I added this to enable cleaner testing with fewer mocks. However, bots are usually meant to be long-running applications, so I'm not sure this was ever useful enough to justify its existence.

These are all handy features, but I don't think any of them are consequential enough to be worth keeping Layabout around. If, going forward, you miss these features, they're probably better suited to be suggestions or pull requests to the official library now.

Thanks

If you used Layabout, thank you. I hope you got as much joy out of it as a user as I got in writing it. I would also like to thank these fine folks for their contributions in getting Layabout off the ground and helping maintain it.

\ No newline at end of file diff --git a/posts/git-filter-wat/git-filter-wat.png b/posts/git-filter-wat/git-filter-wat.png new file mode 100644 index 0000000..3cfba25 Binary files /dev/null and b/posts/git-filter-wat/git-filter-wat.png differ diff --git a/posts/git-filter-wat/index.html b/posts/git-filter-wat/index.html new file mode 100644 index 0000000..679af2d --- /dev/null +++ b/posts/git-filter-wat/index.html @@ -0,0 +1,108 @@ +git filter-wat
Me

git filter-wat

Welcome to this year's annual blog post!

I've been signing git commits for my dotfiles repository since its inception in October of last year, so I was excited to see that GitHub recently added GPG signature verification. All you have to do is upload your public key to GitHub and you'll be verifying commits like a champ. Or so I thought…

Signature Doesn't Match Committer
Unverified commits on GitHub. View full size.

GitHub thinks I'm unverified. I think that's some baloney. I know the public key I uploaded matches the private key I used to sign those commits. Oh, it looks like what they're really concerned about though is that the email on my PGP key doesn't match the email I used with git.

commit 7d74300ee1cd9c2f17a39b143be331cad82fe464
+Author: Reilly Tucker Siemens <reilly.siemens@gmail.com>
+Date:   Sat Mar 5 15:10:18 2016 -0800
+
+    Add cookiecutter configuration.
+
+commit 672f175ba6db1be4f9714f7526c9ff6153c44a81
+Author: Reilly Tucker Siemens <reilly.siemens@gmail.com>
+Date:   Sat Feb 20 14:20:34 2016 -0800
+
+    Add Go to PATH. Make Virtualenv Wrapper config more flexible.
+

Now, before we go any further I should point out I wouldn't be having any problems at all if my user.email matched my GPG key to begin with, but hindsight is 20/20. As it stands, I have a problem and I need to fix it. I need to modify the authorship of these commits to match the email in my GPG key.

How to Fix It

I recently learned that git has a tool called git filter-branch that can be used to make significant and otherwise tedious modifications to git history. A quick trip to Stack Overflow reveals I can use this tool to change the authorship of all the commits in my repository.

Reckless reading leads me to this potential solution.

git filter-branch -f --env-filter "GIT_AUTHOR_EMAIL='reilly@tuckersiemens.com'" HEAD
+
git-filter-wat
A royally screwed up commit log. View full size.

Oh. No. Something is clearly wrong. Looks more like git filter-wat.

How to Actually Fix It

To be honest, I half expected something like this to happen. With the previous command I was never asked to re-sign these commits. The text of my commit message has been clobbered with the PGP signature. How do I fix that?

Some advocate not signing individual commits and instead just using a signed tag, but I don't subscribe to that idea. I really want each individual commit to be signed.

Additional Stack Overflowing (yes, that's a verb) indicates I can use git filter-branch's --msg-filter and --commit-filter options to strip the PGP signature from the commit message and then re-sign each commit. This ends up looking like

git filter-branch -f --env-filter "GIT_AUTHOR_EMAIL='reilly@tuckersiemens.com'" --msg-filter 'sed "/iQIcBA.*/,/.*END PGP SIGNATURE.*/d"' --commit-filter 'git commit-tree -S "$@"' HEAD
+

The --msg-filter uses a sed expression to match and delete the PGP signature up to the END PGP SIGNATURE bit. This leaves the rest of the commit object (basically just the message) intact. The modified object is then passed to the git commit-tree in the --commit-filter which then requires me to re-sign the commit.

Annoyingly, when I actually ran this command I had to sign each and every commit even though I had a gpg-agent running. If anyone can tell me how to avoid that in the future I'd love to know. Luckily it was only 15 commits, but I would find entering a passphrase any more than that rather aggravating.

At this point the unwieldy git filter-branch incantation has been uttered. Let's just double-check the modified commits with a quick git log.

commit 87051d659d16dbe037c9d61dbaaeea38e152a9ff
+Author: Reilly Tucker Siemens <reilly@tuckersiemens.com>
+Date:   Sat Mar 5 15:10:18 2016 -0800
+
+    Add cookiecutter configuration.
+
+commit c56aff44af574bca227587e0f12f5ce841afd2d0
+Author: Reilly Tucker Siemens <reilly@tuckersiemens.com>
+Date:   Sat Feb 20 14:20:34 2016 -0800
+
+    Add Go to PATH. Make Virtualenv Wrapper config more flexible.
+

Looks good to me! Notice that the email isn't the only thing that's changed. The commit hash is also completely different. These aren't modified commits, they're new git objects with the author, date, and commit message preserved. In order to get my changes up to GitHub I'll have to git push --force origin master to blow away my previous history. Most of the time this is probably a bad idea, but this repository exists just for me, so I feel comfortable taking the sledgehammer approach.

Back to Square One

Signature Doesn't Match Committer
These commits are still unverified on GitHub. View full size.

Well, that didn't change anything. What gives? What am I missing? How does GitHub expect this to work in the first place? New idea. What if I set my author email correctly from the get-go and create an entirely new signed commit?

git config --global user.email reilly@tuckersiemens.com
+touch why-doesn\'t-this-work.txt
+git add why-doesn\'t-this-work.txt
+git commit -Sm "Why doesn't this work?"
+git push --force origin master
+
Oh, but it does!
Oh, but it does work! View full size.

Now I think I'm crazy. This works, but why? Something must be different, but what is it? Let's check the git log again.

commit 8f08abcef9f4126ca617b0247c52264a619b049c
+Author: Reilly Tucker Siemens <reilly@tuckersiemens.com>
+Date:   Thu Apr 7 01:12:27 2016 -0700
+
+    Why doesn't this work?
+

How does that look any different from the commit messages above that didn't work? It doesn't appear to be any different, so let's take a deeper look. Borrowing from something I learned while reading the excellent Git Immersion tutorial I made use of git cat-file to inspect the commit objects. Running git cat-file -p 87051d6 shows the commit object for a commit that GitHub won't verify.

tree 8feaf4ea13ac445111d9213cd5f917085e381642
+parent c56aff44af574bca227587e0f12f5ce841afd2d0
+author Reilly Tucker Siemens <reilly@tuckersiemens.com> 1457219418 -0800
+committer Reilly Tucker Siemens <reilly.siemens@gmail.com> 1457219418 -0800
+gpgsig -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1
+
+ iQIcBAABAgAGBQJXBhWDAAoJEBtFjnx8sVSpqFoQAKAjWfQ7IfnnUHx/ZuBWdvQt
+ cWt7+LMMmp6OgjATRv8QGoY6GDarLVMNZjhsvtfym5HWrdWk9WhtDqA9EbiLTdhD
+ yFxWhIDVHjqWt5U7QkWWcIYDUhVJ/z8PShPfa9d0Vwjq+HPqziTwILYjoedqBqOr
+ cce8sxSG1GppAuOyiYzqBoVfOC1ko+egh8gsl9pwrMO345dBp5ZMXtyxv4a4v/7s
+ WhY2Ggf71EJ9YTWGBHe2FT8WEH5DjVZZpsFLRlO6BUklKf8PuUPDQVmpgx60L0qW
+ MBmqcw1ftx3vwTAL/foxmE8KkMi5xnIPtUYDdo6d3a2ZeUuWDJBnb+ZxENTLl1DS
+ rwYI/LKbJwZpMfegwnHaJFgBF7igM7poeP3pytN2qzXYRGyXkJPYYh8Di5/alaGt
+ rp0rtlJJ2tk2m+V9MiqrO8HJoZrt1Y5z/Pg+Fo1yJdB+97fjYJfjnQ7+nTIgkfoB
+ hQlIc/G+w194GEN1AO4P7CeCvXh3necNPKsUZw0BfXrRjEIKGb5Qrs31xY1AJtup
+ 8prxg4jV3EmKBzKng3E65QHTPAjQWl0FhdvI2Qd2ea+fpjSbTKRDSmdi8ghHb6C2
+ Q1azXowhVOoqodINE7wc4OpDsc9hLzCUdY1z8iBgMzsYxjSwJGerRpAWT8mjmoK1
+ Z1wjETnbqvu8FXuVgqpW
+ =CYyg
+ -----END PGP SIGNATURE-----
+
+
+Add cookiecutter configuration.
+

Running git cat-file -p 8f08abc shows the commit object for a commit GitHub will verify.

tree 4dd20b4bf0b143ef3b0ed73c7232b9cf3da669e5
+parent 7d74300ee1cd9c2f17a39b143be331cad82fe464
+author Reilly Tucker Siemens <reilly@tuckersiemens.com> 1460016747 -0700
+committer Reilly Tucker Siemens <reilly@tuckersiemens.com> 1460016747 -0700
+gpgsig -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1
+
+ iQIcBAABAgAGBQJXBhZrAAoJEBtFjnx8sVSpZkMP/jQk4lT+b0kpOj+VfvW3JREH
+ R5ghCTJneZTlQzZcgtvN2ztQXG3Unn2A0YpaoCWC6gve0uv3W75JbJ6//Jtm/udq
+ LP2iiZO8Pp7QdaEvGKL4c0nw/GgBYNicOcL61QWR1ymoK4d3FTGU3dEYMOOWN673
+ vR4DVvv2DGD2OO3VAjpXmznJBGER5k5dtQ5asScWfYej2hEXQfESrWCT1BiXtqxA
+ d/ge92C7t4zMFHs+LYdnXGoRYahQyCTfiIPaDQ9XdDREYMiA0dj6uahKWPhKzYnK
+ 89qXphF3PN1huJKN31eTANuiA2Pt3Swe/RYHOv+l8PcInFZWcmF7uQQ7Eivh+Hi7
+ lO9l7XR9qIiW9r9890V25F2ESTSxoHMpcZDyV9lTDUYEBJgsP6v1C4JG2CvYVkkL
+ vvqVb3CMldDdNvLnavFxmEmIPDNMDLrZR0s0yc5FdYBsADw0VG6QwG3j/IdSyOyH
+ t4QzlFqGelm5vUBiqmZxOJE90LRI4e2876ZI5VPYmJ49mPpU4qNkRMQvVZLwtqOe
+ mm616Ja5IEivM/1BKWIId9kTPB4/TzdgTRR6OJYwKcbdkiSdRIdhWbSe4c1VsTpd
+ YQ6zpzg63/Mm4N3I/4pNXY3AGOa4JtSttRtTjeLnXyStZ47AwthcTB6ioIgLGJQ7
+ 8tEpiPvkr127Mj6VeMFv
+ =AgDl
+ -----END PGP SIGNATURE-----
+
+Why doesn't this work?
+

Whoa. While this may have already been obvious to some of you I had no idea that git objects had separate authors and committers. GitHub was right all along. The email in my signature doesn't match the committer email. I'll bet I can leverage git filter-branch again to finally fix this.
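Eyeballing cat-file output works for one commit, but a tiny helper makes the mismatch hard to miss. Here's a rough, std-only sketch that pulls the email out of a header line in text shaped like the git cat-file -p output above. header_email is a hypothetical name, and the parsing is deliberately naive: it only looks at the headers before the first blank line and assumes the email sits between < and >.

```rust
/// Find the email in the first header line that starts with `field`
/// (e.g. "author" or "committer") in a printed commit object.
fn header_email<'a>(commit: &'a str, field: &str) -> Option<&'a str> {
    let rest = commit
        .lines()
        // Headers end at the first empty line; the message follows it.
        .take_while(|line| !line.is_empty())
        .find_map(|line| line.strip_prefix(field)?.strip_prefix(' '))?;
    let start = rest.find('<')? + 1;
    let end = rest.find('>')?;
    rest.get(start..end)
}
```

Comparing header_email(commit, "author") with header_email(commit, "committer") for commit 87051d6 would have shown reilly@tuckersiemens.com on one side and reilly.siemens@gmail.com on the other.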

How to Actually Actually Fix It (A.K.A Tell Git Who's Boss)

Just as there is a GIT_AUTHOR_EMAIL environment variable to use in a filter, there is also a GIT_COMMITTER_EMAIL. Now I can simply

git filter-branch -f --env-filter "GIT_AUTHOR_EMAIL='reilly@tuckersiemens.com'; GIT_COMMITTER_EMAIL='reilly@tuckersiemens.com'" --msg-filter 'sed "/iQIcBA.*/,/.*END PGP SIGNATURE.*/d"' --commit-filter 'git commit-tree -S "$@"' HEAD
+

and voila!, git cat-file -p fd23e7c shows a commit object with the correct author and committer.

tree 8feaf4ea13ac445111d9213cd5f917085e381642
+parent 244b168ef6fc8fb5aef6abf4a68426299c00e0f1
+author Reilly Tucker Siemens <reilly@tuckersiemens.com> 1457219418 -0800
+committer Reilly Tucker Siemens <reilly@tuckersiemens.com> 1457219418 -0800
+gpgsig -----BEGIN PGP SIGNATURE-----
+ Version: GnuPG v1
+
+ iQIcBAABAgAGBQJXBhuzAAoJEBtFjnx8sVSpXY4QAK5mkQbuAplY9e7FcwR5CB2Q
+ 7ZwgKBJccmkBxWaH6UTsrFJzMXBzpIXnkycUStGMD0tduNo/jxnK19QBPaDVuQ0C
+ oD6RIIgTEsuJ81IserBwILryr6G7MBWQp5qWbXrCztN+SAXg7S3Rh235S6t64HtW
+ FwZ8gBWz+tUhr9ysrOYEilXYDiltO5QRHrVbE0QBGV0FRVHgSlnUeChZaTiYT5+T
+ pHHBezipNqMTnbiRGyc8/yfrfD32YljSRrZKH4ly4sNdUklJKUraoaIZNojybk2f
+ DxZmXgvHlcfIJJO9WzL2KEoCpWMg8hQXM1CQf7u+98hBcWe/1J8E2Wt6mbo04r9J
+ uPYyerLoIgKqAKJj4CZeFCPOzl3N9OPQHTN1aamq6/td5E8MRTxf2vxHP2GS/vOX
+ yFNjBlxy63u1yBi6u79iKvdzjG933Z/MYONOAnSxHvk1Ka79lIh4G49Gk7AMUlON
+ IxJ/PRzf8CjTzz6jaoZQIHG+BkEHXIiT2YZRSNKA4vYyL/iKj9OmXK/0SeSDeELh
+ /4uByx37dgAYW9hZho3d2+BiW5pVsaDOUNLpStu+c4u31juZMlM5OkskvCpAI00a
+ /1F8s1LraG9GxEkyeWNnAm9wV4JWkBjSvfUraj2jeHdotkIpJ4FEOSXpm+wHySX0
+ wd1/IAJz3DesEmKlTIMW
+ =LZcd
+ -----END PGP SIGNATURE-----
+
+
+Add cookiecutter configuration.
+

Unfortunately, now the graph of my git log looks weird. I have double the commits!

Messed Up Git Graph
Not quite there yet.

Success

For the third time Stack Overflow saves my bacon and

git update-ref -d refs/original/refs/heads/master
+

cleans up my graph. Now I can force push to origin master once more and everything will be right again.

Verified Commits
Verified commits on GitHub. Success!
\ No newline at end of file diff --git a/posts/git-filter-wat/messed-up-git-graph.png b/posts/git-filter-wat/messed-up-git-graph.png new file mode 100644 index 0000000..b3beea4 Binary files /dev/null and b/posts/git-filter-wat/messed-up-git-graph.png differ diff --git a/posts/git-filter-wat/oh-but-it-does.png b/posts/git-filter-wat/oh-but-it-does.png new file mode 100644 index 0000000..1578290 Binary files /dev/null and b/posts/git-filter-wat/oh-but-it-does.png differ diff --git a/posts/git-filter-wat/signature-doesnt-match-committer.png b/posts/git-filter-wat/signature-doesnt-match-committer.png new file mode 100644 index 0000000..3b5a56c Binary files /dev/null and b/posts/git-filter-wat/signature-doesnt-match-committer.png differ diff --git a/posts/git-filter-wat/verified-commits.png b/posts/git-filter-wat/verified-commits.png new file mode 100644 index 0000000..726ae1f Binary files /dev/null and b/posts/git-filter-wat/verified-commits.png differ diff --git a/posts/gutenberg-init-blog/index.html b/posts/gutenberg-init-blog/index.html new file mode 100644 index 0000000..aed8906 --- /dev/null +++ b/posts/gutenberg-init-blog/index.html @@ -0,0 +1 @@ +gutenberg init blog

gutenberg init blog

When I first created this site I wanted to get it live as quickly as possible. Hexo, a blogging framework written in Node.js, seemed like the perfect tool. At the time I was rather interested in Node.js, so it seemed natural to use a framework rooted in that community.

By the time of my last post I'd become increasingly disinterested in Node.js and much more interested in Rust and its community. It was mostly procrastination, but I convinced myself that using a tool written in a language I didn't use often directly contributed to the paucity of posts here, so I finally decided to ditch Hexo.

Replacement Criteria

Of course, I needed a suitable replacement. I wanted it to

Fast and flexible are nebulous requirements. Most static site generators are more than fast enough since I'm not generating a site with hundreds of pages (yet). My early experience with Hexo taught me that, while having a bunch of features out of the box is nice, flexibility is more important. I needed to be able to choose my own deployment mechanisms and structure my site the way I wanted to.

I don't personally use an RSS feed reader, but I know friends of mine do and they've asked me to support that (here ya go). I sure as heck didn't want to generate an RSS feed myself.

Live reloading and Markdown support were non-negotiable requirements. The write, build, and reload cycle is too tedious to forgo live reloading. I simply didn't want to use a markup language other than Markdown because writing it is effortless for me at this point.

The Hunt

I surveyed the landscape, but didn't find anything to my liking, so I procrastinated even more. Then I happened across an article linked from an orange website introducing Tera, a template engine in Rust. When I saw that the author drew inspiration from Python's Jinja2 templating library I got really excited.

I tried my hand at writing my own static site generator in Rust. I managed to get a rudimentary POC using Tera operational just in time for Vincent Prouillet, the author of Tera, to announce Gutenberg. It met all my replacement criteria, used the templating engine I was interested in, and was written in Rust!

Building My New Site

Once I found Gutenberg I attacked fixing up my site with vigor. I wrote my own templates, learned a little Sass, and even managed to make a small contribution to Gutenberg. Hexo made my site largely a black box, but now there isn't a line of the source I haven't touched.

As an added bonus, I took the redesign opportunity to ensure there's no JavaScript anywhere on this site, so this page renders exactly the same regardless of whether you're using a JavaScript blocker. In the future I may relax this constraint, but I'm happy with my decision so far and I was able to learn more about Flexbox and CSS3 as a result.

I also had time to add custom 404 Not Found and 50x Server Error pages. If you're interested in more little details, you can find the source for the redesigned site on GitHub.

Credit Where Credit Is Due

This would not have been possible without Vincent's hard work on Tera and Gutenberg. I also borrowed a great deal from Alex Sun's vida Jekyll theme in writing this site's Sass. I've learned a lot from both of them. Thanks!

What's Next?

This site is in a much better place than it was a year ago. I understand it better and I'm more motivated to continue working on it, so expect more posts!

\ No newline at end of file diff --git a/posts/hexo-init-blog/index.html b/posts/hexo-init-blog/index.html new file mode 100644 index 0000000..c062354 --- /dev/null +++ b/posts/hexo-init-blog/index.html @@ -0,0 +1,17 @@ +hexo init blog

hexo init blog

I've been wanting to start a blog for a long time now. Today I'm pulling the trigger on that with a simple hexo init blog. Well, it wasn't that simple, so I feel like it's worth talking about a few of the complications I had.

Hexo, my chosen blogging framework, has made some interesting decisions. Running tree -aACL 1 shows the directory structure for my new Hexo blog.

.
+├── _config.yml
+├── db.json
+├── .deploy_git
+├── .gitignore
+├── node_modules
+├── package.json
+├── public
+├── scaffolds
+├── source
+└── themes
+
+6 directories, 4 files
+

Hexo keeps its generated static files in the public directory. When deploying with git, as I choose to do, it uses the hidden .deploy_git directory to keep changes to public under version control. When generating static files Hexo obliterates anything in the public directory it doesn't know about. This means that even if the user wants to add some static files of their own they'll just be baleeted when Hexo clears out the .deploy_git directory and copies the public directory in.

This isn't a big deal if you don't care about having any static files not generated by Hexo on your site. It just so happens that I'm very keen on having my proof for Keybase. In order to prove ownership of a website, Keybase requires that you serve a file called keybase.txt or .well-known/keybase.txt from the root of your site.

So, how do you serve a custom static file when your blogging framework is intent on keeping you from doing so? I suppose I could do the responsible thing and submit a pull request upstream for a configuration option to preserve certain files, but I'm much too lazy for that. Instead I've enlisted nginx for a band-aid solution.

location =/keybase.txt {
+    root /srv;
+}
+

Most requests to my site pass through nginx to my default root, but any requests for keybase.txt get served out of a separate directory where it safely sits.

\ No newline at end of file diff --git a/posts/index.html b/posts/index.html new file mode 100644 index 0000000..266fbd0 --- /dev/null +++ b/posts/index.html @@ -0,0 +1 @@ +Posts

Posts

avatar.png

No, not that Avatar. And not the other one either. This post is about avatar.png, a handful of lines of PHP that have inspired me for a long time.

Around 2011 or 2012 a friend of mine, Andrew Kvalheim, blew my mind when he made his Skype profile picture display the IP address of the computer I was using. It might have looked a bit like this.

Parsing TFTP in Rust

Several years ago I did a take-home interview which asked me to write a TFTP server in Go. The job wasn't the right fit for me, but I enjoyed the assignment. Lately, in my spare time, I've been tinkering with a Rust implementation. Here's what I've done to parse the protocol.

A Fresh Coat of Paint

I'm starting the new year with a new job. To paraphrase a friend, "it's just moving from one $BIGCORP to another", but it's still exciting. I worked my last gig for 5 years, so I'm nervous, but also very ready to do something new. While I'm doing one new thing I might as well do another. Taking some time off between jobs has given me enough breathing room to redo my website.

node.example.com Is An IP Address

Hello! Welcome to the once-yearly blog post! This year I'd like to examine the most peculiar bug I encountered at work. To set the stage, let's start with a little background.

Deprecating Layabout

Since Layabout launched last year it has been downloaded 5,755 times, gotten 16 stars on GitHub, been used by a Portuguese startup to teach a Haskell workshop, and received a Twitter shout-out from @roach, one of the core contributors to the official Python Slack client. During that time the official client library also got a lot better! So much better, in fact, that I've decided to deprecate Layabout.

Announcing Layabout

Today I'm announcing Layabout, my first official Python library. Layabout is a small event handling library on top of the Slack Real Time Messaging (RTM) API. You can get it right now on PyPI.

Resolving A DNS Issue

Haha. Get it? Resolving a DNS issue. OK, that was bad. You don't have to read anymore, but I'm SOA into this. You might even say I'm in the zone. I think it's gonna be A great read, so consider sticking around, 'cuz there's no TLD;R.

Stateful Callbacks in Python

If you're unfamiliar with what a callback is, don't worry, we can sort that out quickly. If callbacks are old hat for you, you might want to skip to the interesting bit.

Simply put, a callback is a function that is passed as an argument to another function which may execute it.

gutenberg init blog

When I first created this site I wanted to get it live as quickly as possible. Hexo, a blogging framework written in Node.js, seemed like the perfect tool. At the time I was rather interested in Node.js, so it seemed natural to use a framework rooted in that community.

By the time of my last post I'd become increasingly disinterested in Node.js and much more interested in Rust and its community. It was mostly procrastination, but I convinced myself that using a tool written in a language I didn't use often directly contributed to the paucity of posts here, so I finally decided to ditch Hexo.

git filter-wat

Welcome to this year's annual blog post!

I've been signing git commits for my dotfiles repository since its inception in October of last year, so I was excited to see that GitHub recently added GPG signature verification. All you have to do is upload your public key to GitHub and you'll be verifying commits like a champ. Or so I thought…

hexo init blog

I've been wanting to start a blog for a long time now. Today I'm pulling the trigger on that with a simple hexo init blog. Well, it wasn't that simple, so I feel like it's worth talking about a few of the complications I had.

\ No newline at end of file diff --git a/posts/node-example-com-is-an-ip-address/confused-jeff-bridges.webp b/posts/node-example-com-is-an-ip-address/confused-jeff-bridges.webp new file mode 100644 index 0000000..2de2c7b Binary files /dev/null and b/posts/node-example-com-is-an-ip-address/confused-jeff-bridges.webp differ diff --git a/posts/node-example-com-is-an-ip-address/index.html b/posts/node-example-com-is-an-ip-address/index.html new file mode 100644 index 0000000..25ab5d5 --- /dev/null +++ b/posts/node-example-com-is-an-ip-address/index.html @@ -0,0 +1,161 @@ +node.example.com Is An IP Address

node.example.com Is An IP Address

Hello! Welcome to the once-yearly blog post! This year I'd like to examine the most peculiar bug I encountered at work. To set the stage, let's start with a little background.

When we write URLs with a non-standard port we specify the port after a :. With hostnames and IPv4 addresses this is straightforward. Here's some Python code to show how easy it is.

>>> url = urllib.parse.urlparse("https://node.example.com:8000")
+>>> (url.hostname, url.port)
+('node.example.com', 8000)
+>>>
+>>> url = urllib.parse.urlparse("https://192.168.0.1:8000")
+>>> (url.hostname, url.port)
+('192.168.0.1', 8000)
+

Unfortunately, when IPv6 addresses are involved some ambiguity is introduced.

>>> url = urllib.parse.urlparse(
+...     "https://fdc8:bf8b:e62c:abcd:1111:2222:3333:4444:8000"
+... )
+...
+>>> url.hostname
+'fdc8'
+>>> try:
+...     url.port
+... except ValueError as error:
+...     print(error)
+...
+Port could not be cast to integer value as 'bf8b:e62c:abcd:1111:2222:3333:4444:8000'
+

Since IPv6 addresses use a "colon-hex" format with hexadecimal fields separated by : we can't tell a port apart from a normal field. Notice in the example above that the hostname is truncated after the first :, not the one just before 8000.

Fortunately, the spec for URLs recognizes this ambiguity and gives us a way to handle it. RFC 2732 (Format for Literal IPv6 Addresses in URL's) says

To use a literal IPv6 address in a URL, the literal address should be enclosed in "[" and "]" characters.

Update our example above to include [ and ] and voilà! It just works.

>>> url = urllib.parse.urlparse(
+...     "https://[fdc8:bf8b:e62c:abcd:1111:2222:3333:4444]:8000"
+... )
+...
+>>> (url.hostname, url.port)
+('fdc8:bf8b:e62c:abcd:1111:2222:3333:4444', 8000)
+

Armed with that knowledge we can dive into the problem. 🤿

Works On My Machine

A few months ago a co-worker of mine wrote a seemingly innocuous function.

from ipaddress import ip_address
+
+
+def safe_host(host): 
+    """Surround `host` with brackets if it is an IPv6 address."""
+    try:
+        if ip_address(host).version == 6:
+            return "[{}]".format(host)
+    except ValueError:
+        pass
+    return host
+

Elsewhere in the code it was invoked something like this, so that hostnames, IPv4 addresses, and IPv6 addresses could all be safely interpolated.

url = "https://{host}:8000/some/path/".format(host=safe_host(host))
+

Since my co-worker is awesome they wrote tests to validate their code. ✅

def test_safe_host_with_hostname():
+    """Hostnames should be unchanged."""
+    assert safe_host("node.example.com") == "node.example.com"
+
+
+def test_safe_host_with_ipv4_address():
+    """IPv4 addresses should be unchanged."""
+    assert safe_host("192.168.0.1") == "192.168.0.1"
+
+
+def test_safe_host_with_ipv6_address():
+    """IPv6 addresses should be surrounded by brackets."""
+    assert (
+        safe_host("fdc8:bf8b:e62c:abcd:1111:2222:3333:4444")
+        == "[fdc8:bf8b:e62c:abcd:1111:2222:3333:4444]"
+    )
+

Thank goodness they did. The Python 2 tests failed (don't look at me like that 😒).

FAIL py27 in 1.83 seconds
+OK py36 in 2.82 seconds
+OK py37 in 2.621 seconds
+OK py38 in 2.524 seconds
+OK py39 in 2.461 seconds
+

Both the hostname and IPv6 address tests failed. But why did they fail? And why did the Python 3 tests pass? 🤔

We'll start with the hostname failure and try to isolate the bug.

E       AssertionError: assert '[node.example.com]' == 'node.example.com'
+E         - [node.example.com]
+E         ? -                -
+E         + node.example.com
+

The failure says node.example.com was surrounded by brackets, but that's only supposed to happen for IPv6 addresses! Let's crack open a Python 2 interpreter for a quick sanity check.

>>> ipaddress.ip_address("node.example.com").version
+6
+
Confused Jeff Bridges

What On Htrae?

If, like Jeff Bridges, you were confused by that result, relax. We're probably not in a Bizarro World where node.example.com is a valid IPv6 address. There must be an explanation for this behavior.

Things start to become a little more clear when we see the result of the ip_address() function for ourselves.

>>> ipaddress.ip_address("node.example.com")
+IPv6Address(u'6e6f:6465:2e65:7861:6d70:6c65:2e63:6f6d')
+

At first glance that looks like madness. Python 3 behaves in an entirely different manner.

>>> try:
+...     ipaddress.ip_address("node.example.com")
+... except ValueError as error:
+...     print(error)
+... 
+'node.example.com' does not appear to be an IPv4 or IPv6 address
+

Python 3 knows that's not an IPv6 address, so why doesn't Python 2? The answer is in how differently the two Python versions handle text.

Text Is Hard

Computers don't operate on text as humans think of it. They operate on numbers. That's part of why we have IP addresses to begin with. In order to represent human-readable text with computers we had to assign meaning to the numbers. Thus, ASCII was born.

ASCII is a character encoding, which means it specifies how to interpret bytes as text we understand (provided you speak English). So, when your computer sees 01101110 in binary (110 in decimal) you see n because that's what ASCII says it is.

You can see the number to text conversion in action right in the Python interpreter.

>>> ord("n")
+110
+>>> chr(110)
+'n'
+

In fact, it doesn't matter what numbering system you use. If you specify binary, octal, decimal, hexadecimal, whatever... If it can be understood as the right integer it will be displayed correctly.

>>> chr(0b01101110)
+'n'
+>>> chr(0o156)
+'n'
+>>> chr(110)
+'n'
+>>> chr(0x6e)
+'n'
+

Neat, but what does that information do for us?

It's Numbers All The Way Down

Just for giggles, humor me and let's look at the character-number translations for node.example.com. We'll leave out binary and octal, because they make this table uglier than it already is.
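Character   | n   | o   | d   | e   | .  | e   | x   | a  | m   | p   | l   | e   | .  | c  | o   | m
Decimal     | 110 | 111 | 100 | 101 | 46 | 101 | 120 | 97 | 109 | 112 | 108 | 101 | 46 | 99 | 111 | 109
Hexadecimal | 6e  | 6f  | 64  | 65  | 2e | 65  | 78  | 61 | 6d  | 70  | 6c  | 65  | 2e | 63 | 6f  | 6d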

Hey, hold on a second... If you tilt your head sideways and squint that last row looks kinda like an IPv6 address, doesn't it?

We should verify, just to be absolutely certain. You've still got that Python 2 interpreter open, right?

>>> # Convert the characters in the hostname to hexadecimal.
+>>> hostname = "node.example.com"
+>>> hostname_as_hexadecimal = "".join(hex(ord(c))[2:] for c in hostname)
+>>> hostname_as_hexadecimal
+'6e6f64652e6578616d706c652e636f6d'
+>>>
+>>> # Convert the "IP address" to text.
+>>> address = ipaddress.ip_address(hostname)
+>>> str(address)
+'6e6f:6465:2e65:7861:6d70:6c65:2e63:6f6d'
+>>>
+>>> # Remove the colons from that text.
+>>> address_without_colons = str(address).replace(":", "")
+>>> address_without_colons
+'6e6f64652e6578616d706c652e636f6d'
+>>>
+>>> # Compare the results and see they're equal.
+>>> hostname_as_hexadecimal == address_without_colons
+True
+

Sure enough, when you boil them both down to numbers they're the same mess of hexadecimal.

The Belly Of The Beast

If we dig into the source code for the Python 2 version of the ipaddress module we ultimately come to a curious set of lines.

# Constructing from a packed address
+if isinstance(address, bytes):
+    self._check_packed_address(address, 16)
+    bvs = _compat_bytes_to_byte_vals(address)
+    self._ip = _compat_int_from_byte_vals(bvs, 'big')
+    return
+

It turns out that, under certain conditions, the ipaddress module can create IPv6 addresses from raw bytes. My assumption is that it offers this behavior as a convenient way to parse IP addresses from data fresh off the wire.

Does node.example.com meet those certain conditions? You bet it does. Because we're using Python 2 it's just bytes and it happens to be 16 characters long.

>>> isinstance("node.example.com", bytes)
+True
+>>> # `self._check_packed_address` basically just checks how long it is.
+>>> len("node.example.com") == 16
+True
+
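If you don't have a Python 2 interpreter handy, the same behavior can be reproduced on Python 3 by passing bytes explicitly. This is only a sketch, assuming the Python 3 standard library ipaddress module, which likewise accepts packed 4-byte and 16-byte addresses.

```python
import ipaddress

# On Python 3 a bytes literal has to be spelled out with a b prefix.
packed = b"node.example.com"
assert len(packed) == 16  # Exactly the size of a packed IPv6 address.

# Sixteen raw bytes are accepted as a packed IPv6 address.
address = ipaddress.ip_address(packed)
print(address)

# The same characters passed as text are rejected, as shown earlier.
try:
    ipaddress.ip_address("node.example.com")
except ValueError as error:
    print(error)
```

The difference from Python 2 is that here the bytes have to be asked for on purpose. A plain string literal never takes the packed-address path.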

The rest of the ipaddress lines say to interpret the sequence of bytes as a big-endian integer. That's magic best left for another blog post, but the gist is that the hexadecimal interpretation of node.example.com is condensed into a single, huge number.

>>> int("6e6f64652e6578616d706c652e636f6d", 16)
+146793460745001871434687145741037825901L
+

That's an absolutely massive number, but not so massive it won't fit within the IPv6 address space.

>>> ip_address(146793460745001871434687145741037825901L)
+IPv6Address(u'6e6f:6465:2e65:7861:6d70:6c65:2e63:6f6d')
+
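For comparison, here is a rough Python 3 sketch of the same bytes-to-integer-to-address chain. It assumes int.from_bytes() as a stand-in for the module's internal compatibility helpers, and note that Python 3 integers carry no L suffix.

```python
import ipaddress

hostname = b"node.example.com"

# Read the 16 bytes as one big-endian (most significant byte first) integer.
number = int.from_bytes(hostname, "big")

# Parsing the hexadecimal digits directly gives the same value.
assert number == int(hostname.hex(), 16)

# Any non-negative integer below 2**128 fits in the IPv6 address space.
assert number < 2**128
address = ipaddress.ip_address(number)
print(address)
```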

As it turns out, if you're liberal in your interpretation, node.example.com can be an IPv6 address!

You Will Be Reading Meanings

Obviously that's hogwash. Bizarro might be proud, but that's not what we wanted to happen.

There's a quote about numbers which is apocryphally attributed to W.E.B. Du Bois, but that actually comes from Harold Geneen's book, Managing.

When you have mastered the numbers, you will in fact no longer be reading numbers, any more than you read words when reading a book. You will be reading meanings.

Having not read the book I'm probably taking the quote way out of context, but I think it fits our situation well.

As we've seen above, we can freely convert characters to numbers and back again. The root of our problem is that when we use Python 2 it considers text to be bytes. There's not a deeper, inherent meaning. Maybe the bytes are meant to be ASCII, maybe they're meant to be a long number, maybe they're meant to be an IP address. The interpretation of those bytes is up to us.

Python 2 doesn't differentiate between bytes and text by default. In fact, the bytes type is just an alias for str.

>>> bytes
+<type 'str'>
+>>> bytes is str
+True
+

To make that even more concrete, see how Python 2 considers n to be the same as this sequence of raw bytes.

>>> "n" == b"\x6e"
+True
+

Our Python 2 code doesn't work the way we want it to because raw bytes can have arbitrary meaning and we haven't told it to use our intended meaning.

So now we know why Python 2 interprets node.example.com as an IPv6 address, but why does Python 3 behave differently? More importantly, how can we reconcile the two?

256 Characters Ought To Be Enough For Anybody

ASCII looked like a good idea in the 1960s. With decades of hindsight we know the 256 characters afforded to us by Extended ASCII are insufficient to handle all of the world's writing systems. Thus, Unicode was born.

There are scads of blog posts, Wikipedia articles, and technical documents that will do a better job than I can of explaining Unicode in detail. You should read them if you care to, but here's my gist.

Unicode is a standard that assigns every character a number called a code point, plus a set of encodings that turn those code points into bytes. UTF-8 is the dominant encoding. It is backwards compatible with ASCII, so ASCII characters are still just one byte. To handle the multitude of other characters, however, multiple bytes can express a single character.

>>> "n".encode("utf-8").hex()  # 1 character (U+006E), 1 byte.
+'6e'
+>>> "🤿".encode("utf-8").hex()  # 1 character (U+1F93F), 4 bytes.
+'f09fa4bf'
+>>> "悟り".encode("utf-8").hex()  # 2 characters (U+609F, U+308A), 6 bytes.
+'e6829fe3828a'
+
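A small Python 3 sketch makes the character-versus-byte distinction explicit. It reuses the three strings above; the variable names are only for illustration.

```python
samples = ["n", "🤿", "悟り"]

for text in samples:
    encoded = text.encode("utf-8")
    # len() of a str counts code points; len() of bytes counts bytes.
    print(repr(text), len(text), len(encoded))
    # Decoding with the same encoding gets the original text back.
    assert encoded.decode("utf-8") == text

# Pure ASCII text produces identical bytes under ASCII and UTF-8.
hostname = "node.example.com"
assert hostname.encode("ascii") == hostname.encode("utf-8")
```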

Every programming language I know of that respects the difference between raw bytes and Unicode text maintains a strict separation between the two datatypes.

In Python 3 this strict separation is enabled by default. Notice that it doesn't consider n and this sequence of raw bytes to be the same thing.

>>> "n" == b"\x6e"
+False
+

Even better, it doesn't consider str and bytes to be the same type.

>>> bytes is str
+False
+>>> bytes
+<class 'bytes'>
+

If we can get Python 2 to understand Unicode like Python 3 does, then we can probably fix our bug.

As an aside, if you want to learn more about how to handle Unicode in Python, check out Ned Batchelder's talk on Pragmatic Unicode.

How Did We Fix It?

Python 2 does actually know about Unicode, but it considers Unicode text to be separate from "normal" text. At some point in Python 2 history the unicode type was bolted onto the side of the language and not enabled by default. Hard to get excited about it, but it does the trick. At least they knew it's a pain to type unicode() all the time, so there's a handy literal syntax using a u prefix.

>>> unicode("node.example.com") == u"node.example.com"
+True
+

This is not the best fix, but it did in a pinch. We added a line converting the hostname to Unicode right off the bat. We also applied the same transformation to the line with brackets. This way we always process the hostname as Unicode and we always return a Unicode value.

 def safe_host(host):
+     """Surround `host` with brackets if it is an IPv6 address."""
++    host = u"{}".format(host)
+     try:
+         if ip_address(host).version == 6:
+-            return "[{}]".format(host)
++            return u"[{}]".format(host)
+     except ValueError:
+         pass
+

Luckily for us the u prefix also works in Python 3 whereas unicode() does not (because all text is Unicode by default, so the type has no business existing). In Python 3 the u is treated as a no-op.
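A quick way to check the no-op claim on Python 3 (the u prefix is accepted on Python 3.3 and later) is to compare a prefixed literal with a plain one. A minimal sketch:

```python
# Python 3: the u prefix is accepted but changes nothing.
prefixed = u"node.example.com"
plain = "node.example.com"

assert type(prefixed) is str
assert prefixed == plain

# Text and raw bytes remain distinct types regardless of the prefix.
assert not isinstance(prefixed, bytes)
```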

The Python 2 interpreter graciously understands the unicode type is not just raw bytes.

>>> isinstance(u"node.example.com", bytes)
+False
+

When we use the unicode type the ipaddress module no longer tries to interpret node.example.com as bytes and convert those bytes to an IP address. We get just what we expect

>>> try:
+...     ipaddress.ip_address(u"node.example.com")
+... except ValueError as error:
+...     print(error)
+... 
+u'node.example.com' does not appear to be an IPv4 or IPv6 address
+

and our tests pass!

OK py27 in 1.728 seconds
+OK py36 in 2.775 seconds
+OK py37 in 2.717 seconds
+OK py38 in 2.674 seconds
+OK py39 in 2.506 seconds
+

Reflection

I mentioned above that our fix wasn't the best. Given more time, how can we do better?

The first (and best) solution here is to drop Python 2 support. It's 2020 now and Python 2 is officially no longer supported. The original code worked on Python 3. The best long-term decision is to migrate the code to run on Python 3 only and avoid the hassle of Python 2 maintenance. Unfortunately many of the people running this code still depend on it working on Python 2, so we'll have to make that transition gracefully.

If a migration away from Python 2 isn't possible in the near-term, the next best thing to do is update our code so that it uses a compatibility layer like future or six. Those libraries are designed to modernize Python 2 and help smooth over issues like this one.
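To give a feel for what such a layer does, here is a standard-library-only sketch of a text-coercion helper. The names text_type and to_text are hypothetical and written for this example; six ships comparable helpers (six.text_type and six.ensure_text), which would be the better choice in real code.

```python
import sys

# Hypothetical helper, not taken from the original code base.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (only defined on Python 2)


def to_text(value, encoding="utf-8"):
    """Return `value` as Unicode text, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    raise TypeError("expected text or bytes, got {!r}".format(type(value)))


print(to_text(b"node.example.com"))
print(to_text(u"node.example.com"))
```

Because str is bytes on Python 2, a plain string literal takes the decoding branch there and comes out as unicode, which is the behavior the fix above relies on.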

It also wouldn't hurt for us to take a page from Alexis King's Parse, don't validate school of thought. When the hostname enters our program via user input it should immediately be converted to the unicode type (or maybe even an IP address type) so we don't end up solving this problem in several different places throughout the code.
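One possible shape for that idea, sketched for Python 3 only: parse the raw input once at the boundary and pass a typed value around afterwards. Both parse_host and format_host are hypothetical names invented for this example, not functions from the code discussed above.

```python
import ipaddress


def parse_host(raw):
    """Parse user input into an IP address object or a hostname string.

    Hypothetical helper for illustration; not part of the original code.
    """
    if isinstance(raw, bytes):
        # Decode once, at the boundary, so bytes never reach ip_address().
        raw = raw.decode("utf-8")
    try:
        return ipaddress.ip_address(raw)
    except ValueError:
        return raw


def format_host(host):
    """Render a parsed host for use in a URL, bracketing IPv6 addresses."""
    if isinstance(host, ipaddress.IPv6Address):
        return "[{}]".format(host)
    return str(host)


examples = (
    b"node.example.com",
    "192.168.0.1",
    "fdc8:bf8b:e62c:abcd:1111:2222:3333:4444",
)
for raw in examples:
    print(format_host(parse_host(raw)))
```

With this split the bracket decision depends on the parsed type rather than on re-inspecting a string, so the bytes-versus-text question is answered in exactly one place.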

Finally, though our program doesn't currently handle any hostnames in languages other than English, it's probably best to be thinking in Unicode anyway. Again, it's 2020 and internationalized domain names like https://Яндекс.рф are a thing.
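As a small illustration of why this matters for hostnames, Python's built-in idna codec can convert an internationalized name to the ASCII-compatible form used on the wire. This is a sketch only. As far as the standard library documentation goes, the codec implements the older IDNA 2003 rules, so production code may want a dedicated library instead.

```python
hostname = u"Яндекс.рф"

# The built-in "idna" codec converts each label to its ASCII-compatible
# (Punycode) form, which is what goes out in DNS queries.
ascii_hostname = hostname.encode("idna")
print(ascii_hostname)

# Decoding reverses the process; labels were lowercased on the way in.
print(ascii_hostname.decode("idna"))
```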

If you made it this far, thanks for reading. It was fun to turn a brief debugging session with my co-worker into a treatise on the perils of Python 2 and the value of Unicode. See you next year! 😂

\ No newline at end of file diff --git a/posts/parsing-tftp-in-rust/index.html b/posts/parsing-tftp-in-rust/index.html new file mode 100644 index 0000000..babef96 --- /dev/null +++ b/posts/parsing-tftp-in-rust/index.html @@ -0,0 +1,252 @@ +Parsing TFTP in Rust

Parsing TFTP in Rust

Several years ago I did a take-home interview which asked me to write a TFTP server in Go. The job wasn't the right fit for me, but I enjoyed the assignment. Lately, in my spare time, I've been tinkering with a Rust implementation. Here's what I've done to parse the protocol.

Caveat Lector

It's natural to write a technical blog post like this in a somewhat authoritative tone. However, I am not an authority. There will be mistakes. Techniques, libraries, and even protocols change over time. Keep in mind that I am learning too and will happily accept corrections and critiques.

Why Rust?

Much has been written on the merits of Rust by more qualified people. I encourage you to seek their writing and make your own decisions. For my part, I try my best to write fast, safe, and correct code. Rust lets me be more confident about my solutions without the baggage (and danger) of the last 40 years of C/C++. Recent statements and events would seem to agree.

If you know me, you might be surprised that this is my first post on Rust since I've been hyping up the language for the last 7 years. Better late than never. 😂

What Is TFTP?

If you already know the ins and outs of TFTP feel free to skip to the type design or parsing sections.

For those who don't know, TFTP is the Trivial File Transfer Protocol, a simple means of reading and writing files over a network. Initially defined in the early 80s, the protocol was updated by RFC 1350 in 1992. In this post I'll only cover RFC 1350. Extensions like RFC 2347, which adds a 6th packet type, won't be covered.

Security

TFTP is not a secure protocol. It offers no access controls, no authentication, no encryption, nothing. If you're running a TFTP server assume that any other host on the network can read the files hosted by it. You should not run a TFTP server on the open Internet.

Why Use TFTP?

If TFTP is old, insecure, and protocols like HTTP & SSH exist, you might wonder why you'd even bother. Fair enough. If you have other options, you probably don't need to use it.

That said, TFTP is still widely used, especially in server and lab environments where there are closed networks. Combined with DHCP and PXE it provides an efficient means of network booting due to its small memory footprint. This is especially important for embedded devices where memory is scarce. Additionally, if your server supports the experimental multicast option with RFC 2090, files can be read by multiple clients concurrently.

Protocol Overview

TFTP is implemented atop UDP, which means it can't benefit from the retransmission and reliability inherent in TCP. Clients and servers must maintain their own connections. For this reason operations are carried out in lock-step, requiring acknowledgement at each point, so that nothing is lost or misunderstood.

Because files might be larger than what can fit into a single packet or even in memory, TFTP operates on chunks of a file, which it calls "blocks". In RFC 1350 these blocks are always 512 bytes or less, but RFC 1783 allows clients to negotiate different sizes which might be better on a particular network.
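To make the block idea concrete, here is a rough sketch of the chunking, written in Python for brevity rather than Rust. The helper name is hypothetical. It leans on two details from RFC 1350 that aren't spelled out above: block numbers start at 1, and a block shorter than 512 bytes (possibly empty) marks the end of the transfer.

```python
BLOCK_SIZE = 512  # Fixed block size in RFC 1350.


def split_into_blocks(payload, block_size=BLOCK_SIZE):
    """Yield (block number, block) pairs for `payload`.

    Hypothetical helper. Block numbers start at 1 and the sequence ends
    with the first block shorter than `block_size`, which may be empty.
    Wrapping of the 16-bit block number is not handled here.
    """
    number = 1
    for start in range(0, len(payload) + 1, block_size):
        yield number, payload[start:start + block_size]
        number += 1


blocks = list(split_into_blocks(b"x" * 1024))
print([(number, len(block)) for number, block in blocks])
```

A 1024-byte payload therefore needs three blocks: two full ones and a final empty one, so the receiver can tell the transfer is complete.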

By default, initial requests are received on port 69, the official port assigned to TFTP by IANA. Thereafter, the rest of a transfer is continued on a random port chosen by the server. This keeps the primary port free to receive additional requests.

Reading

To read a file, a client sends a read request packet. If the request is valid, the server responds with the first block of data. The client sends an acknowledgement of this block and the server responds with the next block of data. The two continue this dance until there's nothing more to read.

A sequence diagram for a TFTP read request.

Writing

Writing a file to a server is the inverse of reading. The client sends a write request packet and the server responds with an acknowledgement. Then the client sends the first block of data and the server responds with another acknowledgement. Rinse and repeat until the full file is transferred.

A sequence diagram for a TFTP write request.

Errors

Errors are a valid response to any other packet. Most, if not all, errors are terminal. Errors are a courtesy and are neither acknowledged nor retransmitted.

Packet Types

To cover the interactions above, RFC 1350 defines five packet types, each starting with a different 2 byte opcode. I'll elaborate on each of them in turn.
| Opcode | Operation | Abbreviation |
| --- | --- | --- |
| 1 | Read Request | RRQ |
| 2 | Write Request | WRQ |
| 3 | Data | DATA |
| 4 | Acknowledgement | ACK |
| 5 | Error | ERROR |
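As a quick illustration of that table, here's a std-only sketch that reads the opcode off the front of a packet and looks up its abbreviation. Both function names are hypothetical. It assumes the opcode is big-endian (network byte order), which matches the raw byte examples in this post.

```rust
/// Reads the 2 byte, big-endian opcode from the start of a packet.
fn opcode(packet: &[u8]) -> Option<u16> {
    match packet {
        [hi, lo, ..] => Some(u16::from_be_bytes([*hi, *lo])),
        // Fewer than two bytes: not a TFTP packet.
        _ => None,
    }
}

/// Maps an opcode to its RFC 1350 abbreviation.
fn abbreviation(opcode: u16) -> Option<&'static str> {
    match opcode {
        1 => Some("RRQ"),
        2 => Some("WRQ"),
        3 => Some("DATA"),
        4 => Some("ACK"),
        5 => Some("ERROR"),
        _ => None,
    }
}
```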

RRQ / WRQ

Read and write requests share a representation, differing only by opcode. They contain a filename and a mode as null-terminated strings.
| 2 bytes | string | 1 byte | string | 1 byte |
| --- | --- | --- | --- | --- |
| opcode | filename | 0 | mode | 0 |

Here's an example of the raw bytes in an RRQ for a file called foobar.txt in octet mode.

let rrq = b"\x00\x01foobar.txt\x00octet\x00";

And here's a WRQ for the same file in the same mode.

let wrq = b"\x00\x02foobar.txt\x00octet\x00";
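If you'd rather see those bytes assembled than typed out, here's a small std-only sketch. `request_bytes` is a hypothetical helper, and it assumes neither string contains a zero byte of its own, since that would end the field early.

```rust
/// Builds the raw bytes of an RRQ (opcode 1) or WRQ (opcode 2).
/// Assumes `filename` and `mode` contain no zero bytes.
fn request_bytes(opcode: u16, filename: &str, mode: &str) -> Vec<u8> {
    let mut packet = Vec::with_capacity(2 + filename.len() + 1 + mode.len() + 1);
    packet.extend_from_slice(&opcode.to_be_bytes()); // 2 byte opcode
    packet.extend_from_slice(filename.as_bytes());   // filename
    packet.push(0);                                  // terminator
    packet.extend_from_slice(mode.as_bytes());       // mode
    packet.push(0);                                  // terminator
    packet
}
```

Calling `request_bytes(1, "foobar.txt", "octet")` produces the same bytes as the `rrq` literal above.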

Modes

TFTP defines modes of transfer which describe how the bytes being transferred should be handled on the other end. There are three default modes.
| Mode | Meaning |
| --- | --- |
| netascii | 8-bit ASCII; specifies control characters & line endings |
| octet | raw 8-bit bytes; byte-for-byte identical on both ends |
| mail | email the bytes to a user; obsolete even in 1992 |

The protocol allows for other modes to be defined by cooperating hosts, but I can't recommend that. Honestly, octet mode is probably sufficient for most modern needs.
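One wrinkle from RFC 1350 worth knowing: mode strings may arrive in any mix of upper and lower case. Here's a minimal sketch of matching all three default modes that way. `TransferMode` and `parse_mode` are hypothetical names, kept separate from the `Mode` type defined later in this post.

```rust
/// The three default modes (hypothetical enum for illustration).
#[derive(Debug, PartialEq)]
enum TransferMode {
    Netascii,
    Octet,
    Mail,
}

/// Matches a mode string case-insensitively.
fn parse_mode(mode: &str) -> Option<TransferMode> {
    if mode.eq_ignore_ascii_case("netascii") {
        Some(TransferMode::Netascii)
    } else if mode.eq_ignore_ascii_case("octet") {
        Some(TransferMode::Octet)
    } else if mode.eq_ignore_ascii_case("mail") {
        Some(TransferMode::Mail)
    } else {
        // Anything else would be a private agreement between hosts.
        None
    }
}
```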

DATA

Data packets contain the block number being sent and the corresponding data as raw bytes.
| 2 bytes | 2 bytes | n bytes |
| --- | --- | --- |
| opcode | block # | data |

Here's an example of the raw bytes in a DATA packet for the first block of a transfer with the contents Hello, World!.

let data = b"\x00\x03\x00\x01Hello, World!";
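Going the other way, here's a std-only sketch that takes a DATA packet apart with a slice pattern. `split_data` is a hypothetical helper, not the nom-based parser this post builds later.

```rust
/// Splits a DATA packet into its block number and payload.
/// Returns None if the opcode isn't 3 or the header is incomplete.
fn split_data(packet: &[u8]) -> Option<(u16, &[u8])> {
    match packet {
        // Opcode 3 as two big-endian bytes, then the block number, then data.
        [0, 3, hi, lo, data @ ..] => Some((u16::from_be_bytes([*hi, *lo]), data)),
        _ => None,
    }
}
```

Note that an empty payload is valid here: a DATA packet with only the 4 byte header is how a transfer ends when the file length is an exact multiple of the block size.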

ACK

Acknowledgements need only contain the block number they correspond to.
| 2 bytes | 2 bytes |
| --- | --- |
| opcode | block # |

Here's an example of the raw bytes in an ACK packet for the first block of a transfer.

let ack = b"\x00\x04\x00\x01";

ERROR

Errors contain a numeric error code and a human-readable, null-terminated string error message.
| 2 bytes | 2 bytes | string | 1 byte |
| --- | --- | --- | --- |
| opcode | error code | error message | 0 |

Here's an example of the raw bytes in an ERROR packet for a "File not found" error.

let error = b"\x00\x05\x00\x01File not found\x00";
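For comparison with the layout above, here's a std-only sketch that pulls the code and message back out of an ERROR packet. `split_error` is a hypothetical helper, and it assumes the message is valid UTF-8, which the protocol itself doesn't promise.

```rust
/// Splits an ERROR packet into its numeric code and message.
/// Assumes a UTF-8 message followed by exactly one terminating zero byte.
fn split_error(packet: &[u8]) -> Option<(u16, &str)> {
    match packet {
        [0, 5, hi, lo, rest @ ..] => {
            let code = u16::from_be_bytes([*hi, *lo]);
            // The last byte must be the terminator...
            let (last, message) = rest.split_last()?;
            // ...and no zero byte may appear inside the message.
            if *last != 0 || message.contains(&0) {
                return None;
            }
            let message = std::str::from_utf8(message).ok()?;
            Some((code, message))
        }
        _ => None,
    }
}
```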

By default, TFTP defines eight error codes. Since the error code is a 16-bit integer there's enough space for you and your friends to define 65,528 of your own. In practice, maybe don't.
| Value | Meaning |
| --- | --- |
| 0 | Not defined, see error message (if any). |
| 1 | File not found. |
| 2 | Access violation. |
| 3 | Disk full or allocation exceeded. |
| 4 | Illegal TFTP operation. |
| 5 | Unknown transfer ID. |
| 6 | File already exists. |
| 7 | No such user. |
| ... | ... |
| 65,535 | Go wild, do whatever. |

Type Design

Now we all know entirely too much about TFTP. Let's write some code already!

Before I start parsing anything I find it helpful to design the resulting types. Even in application code I put on my library developer hat so I'm not annoyed by my own abstractions later.

Let's motivate this design by looking at some code that would use it.

use std::net::UdpSocket;

let mut buffer = [0; 512];
let socket = UdpSocket::bind("127.0.0.1:6969")?;
let length = socket.recv(&mut buffer)?;

let data = &buffer[..length];
todo!("Get our packet out of data!");

In both std::net::UdpSocket and tokio::net::UdpSocket the interface that we have to work with knows nothing about packets, only raw &[u8] (a slice of bytes).

So, our task is to turn a &[u8] into something else. But what? In other implementations I've seen it's common to think of all 5 packet types as variations on a theme. We could follow suit, doing the Rusty thing and defining an enum.

enum Packet {
    Rrq,
    Wrq,
    Data,
    Ack,
    Error,
}

I might have liked my Go implementation to look like this, if only Go had enums! 😒

This design choice has an unintended consequence though. As mentioned earlier, RRQ and WRQ only really matter on the initial request. The remainder of the transfer isn't concerned with those variants. Even so, Rust's (appreciated) insistence on exhaustively matching patterns would make us write code like this.

match packet(&data)? {
    Packet::Data => handle_data(),
    Packet::Ack => handle_ack(),
    Packet::Error => handle_error(),
    _ => unreachable!("Didn't we already handle this?"),
}

Also, you might be tempted to use unreachable! for such code, but it actually is reachable. An ill-behaved client could send a request packet mid-connection and this design would allow it!

Instead, what if we were more strict with our types and split the initial Request from the rest of the Transfer?

Requests

Before we can talk about a Request we should talk about its parts. When we talked about packet types we saw that RRQ and WRQ only differed by opcode and the rest of the packet was the same, a filename and a mode.

A Mode is another natural enum, but for our purposes we'll only bother with the Octet variant for now.

pub enum Mode {
    // Netascii, for completeness.
    Octet,
    // Mail, if only to gracefully send an ERROR.
}

As an added convenience later on we'll add a Display impl for Mode so we can convert it to a string.

use std::fmt::Display;

impl Display for Mode {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::Octet => write!(f, "octet"),
        }
    }
}

A Mode combined with a filename make up the "inner type", which I'll call a Payload for lack of a better term. I've taken some liberties by declaring filename a PathBuf, which we'll touch on briefly in the parsing section.

use std::path::PathBuf;

pub struct Payload {
    pub filename: PathBuf,
    pub mode: Mode,
}

Now we can define a Request as an enum where each variant has a Payload.

pub enum Request {
    Read(Payload),
    Write(Payload),
}

Transfers

Request takes care of RRQ and WRQ packets, so a Transfer enum needs to take care of the remaining DATA, ACK, & ERROR packets. Transfers are the meat of the protocol and more complex than requests. Let's break down each variant.

Data

The Data variant needs to contain the block number, which is 2 bytes and fits neatly into a u16. It also needs to contain the raw bytes of the data. There are many ways to represent this, including using a Vec<u8> or a bytes::Bytes. However, I think the most straightforward is as a &[u8] even though it introduces a lifetime.

Ack

The Ack packet is the simplest and only needs a block number. We'll use a solitary u16 for that.

Error

The Error variant warrants more consideration because of the well-defined error codes. I abhor magic numbers in my code, so I'll prefer to define another enum called ErrorCode for those. For the message a String should suffice.

ErrorCode

Defining an ErrorCode involves more boilerplate than I'd like, so I'll show three variants and leave the remainder as an exercise for the reader.

// Debug and PartialEq let us compare codes with assert_eq! later on.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum ErrorCode {
    Undefined,
    FileNotFound,
    // ...
    Unknown(u16),
}

The Undefined variant is, humorously, defined, but the Unknown variant I've added here is not part of RFC 1350. It merely acts as a catch-all for the remaining error space. Conveniently, Rust enums allow variants to contain other data.

Because of this Unknown variant I didn't opt for a C-style enum like

enum ErrorCode {
    Undefined = 0,
    FileNotFound = 1,
    // ...
}

so we can't cast an ErrorCode to a u16.

// This explodes! 💣💥
let code = ErrorCode::Unknown(42) as u16;

However, we can add From implementations. One to convert from an ErrorCode to a u16.

impl From<ErrorCode> for u16 {
    fn from(error_code: ErrorCode) -> Self {
        match error_code {
            ErrorCode::Undefined => 0,
            ErrorCode::FileNotFound => 1,
            // ...
            ErrorCode::Unknown(n) => n,
        }
    }
}

And another to convert from a u16 to an ErrorCode.


impl From<u16> for ErrorCode {
    fn from(code: u16) -> Self {
        match code {
            0 => Self::Undefined,
            1 => Self::FileNotFound,
            // ...
            n => Self::Unknown(n),
        }
    }
}

That way we still have a convenient method for conversions.

let code: u16 = 42;
let error: ErrorCode = code.into();
assert_eq!(error, ErrorCode::Unknown(42));
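A property I find worth checking is that the two conversions agree for every possible u16. Here's a self-contained sketch that uses only the three variants shown above (the elided ones are left out, so this is not the full enum). The `roundtrips` helper is a hypothetical name.

```rust
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum ErrorCode {
    Undefined,
    FileNotFound,
    // Remaining RFC 1350 codes omitted in this sketch.
    Unknown(u16),
}

impl From<ErrorCode> for u16 {
    fn from(error_code: ErrorCode) -> Self {
        match error_code {
            ErrorCode::Undefined => 0,
            ErrorCode::FileNotFound => 1,
            ErrorCode::Unknown(n) => n,
        }
    }
}

impl From<u16> for ErrorCode {
    fn from(code: u16) -> Self {
        match code {
            0 => Self::Undefined,
            1 => Self::FileNotFound,
            n => Self::Unknown(n),
        }
    }
}

/// True if `code` survives a trip through ErrorCode and back.
fn roundtrips(code: u16) -> bool {
    u16::from(ErrorCode::from(code)) == code
}
```

The trip in the other direction is not quite an identity: a hand-built `ErrorCode::Unknown(0)` converts to 0, which converts back to `ErrorCode::Undefined`. That seems like the right normalization to me, but it's something to keep in mind when comparing codes.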

Putting It All Together

With each variant considered, we arrive at an enum that looks like this.

pub enum Transfer<'a> {
    Data { block: u16, data: &'a [u8] },
    Ack { block: u16 },
    Error { code: ErrorCode, message: String },
}

I could have defined structs to hold the inner data for each variant like I did with Payload earlier, but because none of the variants had the same shape I felt less inclined to do so.
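The payoff of splitting the types is that a match over a transfer packet is exhaustive with no wildcard arm and no unreachable!. Here's a small self-contained sketch of that. To keep it short I've simplified `code` to a plain u16 instead of ErrorCode, and `describe` is a hypothetical function.

```rust
/// Simplified copy of the Transfer enum (code is a bare u16 in this sketch).
pub enum Transfer<'a> {
    Data { block: u16, data: &'a [u8] },
    Ack { block: u16 },
    Error { code: u16, message: String },
}

/// Every variant is handled, so the compiler needs no catch-all arm.
fn describe(transfer: &Transfer) -> String {
    match transfer {
        Transfer::Data { block, data } => format!("DATA #{} ({} bytes)", block, data.len()),
        Transfer::Ack { block } => format!("ACK #{}", block),
        Transfer::Error { code, message } => format!("ERROR {}: {}", code, message),
    }
}
```

A stray RRQ or WRQ mid-transfer simply can't be represented here. It would fail to parse as a Transfer, and we can answer with an ERROR instead of panicking.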

Parsing

Now that we have a high-level type design to match the low-level network representation we can bridge the two by parsing. There are as many ways to shave this Yacc as there were enums in our packet types, but I settled on the nom library.

What Is nom?

nom's own readme does a better job of describing itself than I ever could, so I'll just let it do the talking.

nom is a parser combinators library written in Rust. Its goal is to provide tools to build safe parsers without compromising the speed or memory consumption. To that end, it uses extensively Rust's strong typing and memory safety to produce fast and correct parsers, and provides functions, macros and traits to abstract most of the error prone plumbing.

That sounds good and all, but what the heck is a parser combinator? Once again, nom has a great description which I encourage you to read. The gist is that, unlike other approaches, parser combinators encourage you to give your parsing a functional flair. You construct small functions to parse the simplest patterns and gradually compose them to handle more complex inputs.

nom has an extra advantage in that it is byte-oriented. It uses &[u8] as its base type, which makes it convenient for parsing network protocols. This is exactly the type we receive off the wire.

Defining Combinators

It's finally time to define some combinators and do some parsing! Even if you're familiar with Rust, nom combinators might look more like Greek to you. I'll explain the first one in depth to show how they work and then explain only the more confusing parts as we go along. First, a small primer.

nom combinators return IResult, a type alias for a Result that's generic over three types instead of the usual two.

pub type IResult<I, O, E = Error<I>> = Result<(I, O), Err<E>>;

These types are the input type I, the output type O, and the error type E (usually a nom error). I understand this type to mean that I will be parsed to produce O and any leftover I, as long as no error E happens. For our purposes I is &[u8] and we'll have a couple of different O types.
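If the "leftover input plus output" shape feels abstract, here's a hand-rolled, std-only imitation of it. To be clear, this is not nom's API: `ParseResult`, this `be_u16`, and `opcode_and_block` are stand-ins I've written for illustration, with a plain string as the error type.

```rust
/// Stand-in for the IResult shape: (leftover input, output) or an error.
type ParseResult<'a, O> = Result<(&'a [u8], O), &'static str>;

/// Parses a big-endian u16 and returns whatever input is left over.
fn be_u16(input: &[u8]) -> ParseResult<'_, u16> {
    match input {
        [hi, lo, rest @ ..] => Ok((rest, u16::from_be_bytes([*hi, *lo]))),
        _ => Err("need at least two bytes"),
    }
}

/// Composes two small parsers by threading the leftover input through.
fn opcode_and_block(input: &[u8]) -> ParseResult<'_, (u16, u16)> {
    let (input, opcode) = be_u16(input)?;
    let (input, block) = be_u16(input)?;
    Ok((input, (opcode, block)))
}
```

Each parser consumes a little from the front and hands the remainder to the next one. That hand-off is the part nom's combinators automate.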

Null Strings

null references are famously a "billion dollar mistake" and I can't say I like null any better in this protocol.

Like all other strings, it is terminated with a zero byte.
— RFC 1350, smugly

Or, you know, just tell me how long the darn string is. You're the one who put it in the packet... Yes, I know why you did it, but I don't have to like it. 🤪

Mercifully, the nom toolkit has everything we need to slay this beast.

use nom::{
    bytes::complete::{tag, take_till},
    combinator::map_res,
    sequence::tuple,
    IResult,
};

fn null_str(input: &[u8]) -> IResult<&[u8], &str> {
    map_res(
        tuple((take_till(|b| b == b'\x00'), tag(b"\x00"))),
        |(s, _)| std::str::from_utf8(s),
    )(input)
}

Let's work inside out to understand what null_str is doing.

  1. take_till accepts a function (here we use a closure with b for each byte) and collects up bytes from the input until one of the bytes matches the null byte, b'\x00'. This gets us a &[u8] up until, but not including, our zero byte.

  2. tag here just recognizes the zero byte for completeness, but we'll discard it later.

  3. tuple applies a tuple of parsers one by one and returns their results as a tuple.

  4. map_res applies a function returning a Result over the result of a parser. This gives us a nice way to call a fallible function on the results of earlier parsing, take_till and tag in this case.

  5. std::str::from_utf8, the fallible function inside our outermost closure, converts our &[u8] (now sans zero byte) into a Rust &str, which is not terminated with a zero byte.

  6. IResult<&[u8], &str> ties it all together at the end in null_str's return signature returning any unmatched &[u8] and a &str if successful.
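For contrast, the same steps can be sketched with nothing but the standard library. This is not the nom version, just a hand-written equivalent under the same UTF-8 assumption. It returns an Option instead of an IResult to keep things short, and I've named it `null_str_std` (a hypothetical name) so it isn't confused with the combinator above.

```rust
/// Std-only equivalent of the null_str steps.
/// Returns (leftover input, string) like a nom parser would.
fn null_str_std(input: &[u8]) -> Option<(&[u8], &str)> {
    // Steps 1 and 2: find the zero byte; fail if there isn't one.
    let end = input.iter().position(|&b| b == 0)?;
    // Steps 4 and 5: the bytes before it must be valid UTF-8.
    let text = std::str::from_utf8(&input[..end]).ok()?;
    // Step 6: skip the zero byte and return the unmatched remainder.
    Some((&input[end + 1..], text))
}
```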

It's important to note that I'm taking another huge liberty here by converting these bytes to a Rust string at all. Rust strings are guaranteed to be valid UTF-8. TFTP predates UTF-8, so the protocol did not specify that these strings should be Unicode. Later, I might look into an OsString, but for now non-Unicode strings will cause failures.
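Here's what that failure looks like in practice, using a Latin-1 encoded filename as the example. Both helper names are hypothetical. The lossy variant is only there for comparison, since it silently changes the name.

```rust
/// Strict conversion: fails on anything that isn't valid UTF-8.
fn strict(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Lossy conversion: replaces invalid sequences with U+FFFD.
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

The bytes `caf\xe9.txt` (an e with an acute accent in Latin-1) are rejected by `strict` and come out of `lossy` with a replacement character where the accent was.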

Please, only send me UTF-8 strings.
— Me, wearily

Request Combinators

Since Request only concerns itself with the first two packet types, RRQ and WRQ, we can start parsing by matching only those opcodes. For convenience I used the num-derive crate to create a RequestOpCode enum so I could use FromPrimitive::from_u16.

The request_opcode combinator uses map_opt and be_u16 combinators to parse a u16 out of the input and pass it to from_u16 to construct a RequestOpCode.

use nom::{combinator::map_opt, number::complete::be_u16};
use num_derive::FromPrimitive;
use num_traits::FromPrimitive;

#[derive(FromPrimitive)]
enum RequestOpCode {
    Rrq = 1,
    Wrq = 2,
}

fn request_opcode(input: &[u8]) -> IResult<&[u8], RequestOpCode> {
    map_opt(be_u16, RequestOpCode::from_u16)(input)
}

To parse a Mode we map the result of tag_no_case onto our Mode constructor. This function would need to be slightly more complex if we were supporting more than octet mode right now, but not by much.

use nom::{bytes::complete::tag_no_case, combinator::map};

fn mode(input: &[u8]) -> IResult<&[u8], Mode> {
    map(tag_no_case(b"octet\x00"), |_| Mode::Octet)(input)
}

For a Payload we can use tuple with our mode combinator and null_str to match our filename. We then use a provided Into impl to convert our filename &str to a PathBuf.

fn payload(input: &[u8]) -> IResult<&[u8], Payload> {
    let (input, (filename, mode)) = tuple((null_str, mode))(input)?;
    Ok((
        input,
        Payload {
            filename: filename.into(),
            mode,
        },
    ))
}

Finally, we reach the top level of parsing and put all the rest together. The request function is not, itself, a combinator, which is why you see the Finish::finish calls here. We use all_consuming to ensure no input remains after parsing with payload and map the result to our respective Read and Write variants. We also hide nom errors inside a custom error.

use nom::{combinator::all_consuming, Finish};

pub fn request(input: &[u8]) -> Result<Request, ParsePacketError> {
    let iresult = match request_opcode(input).finish()? {
        (input, RequestOpCode::Rrq) => map(all_consuming(payload), Request::Read)(input),
        (input, RequestOpCode::Wrq) => map(all_consuming(payload), Request::Write)(input),
    };

    iresult
        .finish()
        .map(|(_, request)| request)
        .map_err(ParsePacketError::from)
}

With our combinators in order we can add a Request::deserialize method to our enum to hide the implementation details, making it much easier to switch parsing logic later if we want.

impl Request {
    pub fn deserialize(bytes: &[u8]) -> Result<Self, ParsePacketError> {
        parse::request(bytes)
    }
}

Parsing Failures

You might have wondered where that ParsePacketError came from. It's right here. I used the thiserror crate because it's invaluable when crafting custom errors. Thanks, @dtolnay!

#[derive(Debug, PartialEq, thiserror::Error)]
#[error("Error parsing packet")]
pub struct ParsePacketError(nom::error::Error<Vec<u8>>);

// Custom From impl because thiserror's #[from] can't translate this for us.
impl From<nom::error::Error<&[u8]>> for ParsePacketError {
    fn from(err: nom::error::Error<&[u8]>) -> Self {
        // Copy the borrowed input so the error owns its data.
        ParsePacketError(nom::error::Error::new(err.input.to_vec(), err.code))
    }
}

You might also wonder why I converted from the original nom::error::Error<&[u8]> to nom::error::Error<Vec<u8>>. Apparently std::error::Error::source() requires errors to be dyn Error + 'static, so non-static lifetimes aren't allowed if you want to provide a backtrace, which I might like to do at some point. Also, it just seems reasonable for an Error type to own its data.

While we were careful to split up our Request and Transfer types I didn't see a whole lot of benefit in having separate error types, so I reused ParsePacketError for Transfer as well.

Transfer Combinators

The Transfer combinators are very similar to what we did for Request. The opcode handling is basically the same, but with different numeric values so we can't accidentally parse any other opcodes.

use num_derive::FromPrimitive;
use num_traits::FromPrimitive;

#[derive(FromPrimitive)]
enum TransferOpCode {
    Data = 3,
    Ack = 4,
    Error = 5,
}

fn transfer_opcode(input: &[u8]) -> IResult<&[u8], TransferOpCode> {
    map_opt(be_u16, TransferOpCode::from_u16)(input)
}

For Data we just peel off the u16 block number and then retain the rest as the original &[u8]. The type alias here isn't necessary, but I like to do small things like this for organizational purposes.

use nom::combinator::rest;

type Data<'a> = (u16, &'a [u8]);

fn data(input: &[u8]) -> IResult<&[u8], Data> {
    tuple((be_u16, rest))(input)
}

Ack is, once again, the simplest. Just a named wrapper around be_u16.

type Ack = u16;

fn ack(input: &[u8]) -> IResult<&[u8], Ack> {
    be_u16(input)
}

The Error variant is nearly as simple, but we need a call to Result::map to call Into impls and convert code from u16 to ErrorCode and message from &str to String.

type Error = (ErrorCode, String);

fn error(input: &[u8]) -> IResult<&[u8], Error> {
    tuple((be_u16, null_str))(input)
        .map(|(input, (code, message))| (input, (code.into(), message.into())))
}

When we put all these combinators together in a transfer function it looks more complex than our earlier request function. That's only because there are more variants, and my choice to use anonymous struct variants instead of tuple structs means there's no easy constructor, so we map over a closure. Otherwise the idea is the same as before.

pub fn transfer(input: &[u8]) -> Result<Transfer, ParsePacketError> {
    let iresult = match transfer_opcode(input).finish()? {
        (input, TransferOpCode::Data) => map(all_consuming(data), |(block, data)| {
            Transfer::Data { block, data }
        })(input),
        (input, TransferOpCode::Ack) => {
            map(all_consuming(ack), |block| Transfer::Ack { block })(input)
        }
        (input, TransferOpCode::Error) => map(all_consuming(error), |(code, message)| {
            Transfer::Error { code, message }
        })(input),
    };

    iresult
        .finish()
        .map(|(_, transfer)| transfer)
        .map_err(ParsePacketError::from)
}

Just like with Request we create a Transfer::deserialize method to hide these parsing details from the rest of our code.

impl<'a> Transfer<'a> {
    pub fn deserialize(bytes: &'a [u8]) -> Result<Self, ParsePacketError> {
        parse::transfer(bytes)
    }
}

Serialization

We can now read bytes into packets, which is handy, but astute readers will have noticed that you need to do the reverse if you're going to have a full TFTP conversation. Luckily, this serialization is (mostly) infallible, so there's less to explain.

I used BytesMut because I was already using the bytes crate for the extension methods on the BufMut trait like put_slice. Plus, this way I avoid an accidental panic if I pass a &mut [u8] and forget to size it appropriately.
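The difference between a fixed slice and a growable buffer is easy to show with plain std types. In this sketch `Vec<u8>` stands in for `BytesMut` (so no bytes crate is involved), and both function names are hypothetical.

```rust
/// Writes an ACK into a fixed buffer. Copying into a too-small slice
/// would panic, so this version checks the length first.
fn write_ack_fixed(buffer: &mut [u8], block: u16) -> Option<usize> {
    let dst = buffer.get_mut(..4)?; // None if the buffer is under 4 bytes
    dst[..2].copy_from_slice(&4u16.to_be_bytes()); // opcode 4 = ACK
    dst[2..].copy_from_slice(&block.to_be_bytes());
    Some(4)
}

/// Writes an ACK into a growable buffer. No sizing required.
fn write_ack_growable(buffer: &mut Vec<u8>, block: u16) {
    buffer.extend_from_slice(&4u16.to_be_bytes());
    buffer.extend_from_slice(&block.to_be_bytes());
}
```

The fixed version forces every caller to think about sizes and handle the failure case. The growable version just works, at the cost of a possible allocation.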

Serializing Request

Serializing a Request packet is deceptively straightforward. We use a match expression to pull our Payload out of the request and associate it with a RequestOpCode. Then we just serialize the opcode as a u16 with put_u16. The filename and mode we serialize as null-terminated strings using a combo of put_slice and put_u8.

use bytes::{BufMut, BytesMut};

impl Request {
    pub fn serialize(&self, buffer: &mut BytesMut) {
        let (opcode, payload) = match self {
            Request::Read(payload) => (RequestOpCode::Rrq, payload),
            Request::Write(payload) => (RequestOpCode::Wrq, payload),
        };

        buffer.put_u16(opcode as u16);
        buffer.put_slice(payload.filename.to_string_lossy().as_bytes());
        buffer.put_u8(0x0);
        buffer.put_slice(payload.mode.to_string().as_bytes());
        buffer.put_u8(0x0);
    }
}

Converting our mode with as_bytes through a to_string is possible thanks to our earlier Display impl for Mode. The filename conversion to bytes through PathBuf's to_string_lossy might reasonably raise some eyebrows. Unlike strings, a Rust path is not guaranteed to be UTF-8, so any invalid sequences will be replaced with � (U+FFFD). For now, given my earlier Unicode decision, I'm comfortable with this, but a more robust method is desirable.

Serializing Transfer

Serializing a Transfer packet is more straightforward.

impl Transfer<'_> {
    pub fn serialize(&self, buffer: &mut BytesMut) {
        match *self {
            Self::Data { block, data } => {
                buffer.put_u16(TransferOpCode::Data as u16);
                buffer.put_u16(block);
                buffer.put_slice(data);
            }
            Self::Ack { block } => {
                buffer.put_u16(TransferOpCode::Ack as u16);
                buffer.put_u16(block);
            }
            Self::Error { code, ref message } => {
                buffer.put_u16(TransferOpCode::Error as u16);
                buffer.put_u16(code.into());
                buffer.put_slice(message.as_bytes());
                buffer.put_u8(0x0);
            }
        }
    }
}

As before, with each variant we serialize a u16 for the TransferOpCode and then do variant-specific serialization.

That's it! Now we can read and write structured data to and from raw bytes! 🎉

Tests

A post on parsing wouldn't be complete without some tests showing that our code works as expected. First, we'll use the marvelous test-case crate to bang out a few negative tests on things we expect to be errors.

#[test_case(b"\x00" ; "too small")]
#[test_case(b"\x00\x00foobar.txt\x00octet\x00" ; "too low")]
#[test_case(b"\x00\x03foobar.txt\x00octet\x00" ; "too high")]
fn invalid_request(input: &[u8]) {
    let actual = Request::deserialize(input);
    // We don't care about the nom details, so ignore them with ..
    assert!(matches!(actual, Err(ParsePacketError(..))));
}

And, for good measure, we'll show that we can round-trip an RRQ packet from raw bytes with a stop at a proper enum in between.

#[test]
fn roundtrip_rrq() -> Result<(), ParsePacketError> {
    let before = b"\x00\x01foobar.txt\x00octet\x00";
    let expected = Request::Read(Payload {
        filename: "foobar.txt".into(),
        mode: Mode::Octet,
    });

    let packet = Request::deserialize(before)?;
    // Use an under-capacity buffer to test panics.
    let mut after = BytesMut::with_capacity(4);
    packet.serialize(&mut after);

    assert_eq!(packet, expected);
    assert_eq!(&before[..], after);
    // The test returns a Result, so it needs an explicit success value.
    Ok(())
}

Unless you want to copy/paste all this code you'll have to trust me that the tests pass. 😉 Don't worry, I've written many more tests, but this is a blog post, not a test suite, so I'll spare you the details.

Acknowledgements

Wow. You actually read all the way to the end. Congrats, and more importantly, thank you! 🙇‍♂️

All of the work above is part of a personal project I chip away at in my spare time, but I don't do it alone. I owe a huge debt of gratitude to my friend & Rust mentor, Zefira, who has spent countless hours letting me pick her brain on every minute detail of this TFTP code. I could not have written this blog post without her!

I also need to thank Yiannis M (@oblique) for their work on the async-tftp-rs crate, from which I have borrowed liberally and learned a great deal. You may recognize some combinators if you dive into that code.

Finally, I can't thank my wife enough for helping me edit this. There are many fewer mistakes as a result.

The source code for the rest of the project is not currently public, but when I'm more confident in it I'll definitely share more details. Meanwhile, I welcome any and all suggestions on how to make what I've written here more efficient and safe.

[Figure: posts/parsing-tftp-in-rust/rrq.svg, sequence diagram for a read request. Labels: Client, Server, loop, RRQ, DATA, ACK.]
[Figure: posts/parsing-tftp-in-rust/wrq.svg, sequence diagram for a write request. Labels: Client, Server, loop, WRQ, ACK, DATA.]
[Figure: posts/resolving-a-dns-issue/after.svg, chart titled "HTTP GET by IP vs. HTTP GET by Hostname". Y-axis: seconds, 0 to 0.16. X-axis: timestamps from 2018-03-26T05:47:44Z to 2018-03-26T06:16:15Z. Per-point values omitted.]
07692307690.03204.7683149244044371.42307692307690.04207.4783243860826343.538461538461550.05210.18833384776076315.65384615384620.02212.89834330943896399.30769230769230.04215.60835277111715343.538461538461550.04218.31836223279538343.538461538461550.03221.02837169447355371.42307692307690.02223.73838115615172399.30769230769230.05226.44839061782994315.65384615384620.03229.15840007950814371.42307692307690.04231.86840954118634343.538461538461550.02234.5784190028645399.30769230769230.05237.2884284645427315.65384615384620.03239.9984379262209371.42307692307692018-03-26T06:02:00Z0.03242.7084473878991371.42307692307690.04245.4184568495773343.538461538461550.01248.1284663112555427.19230769230770.02250.83847577293372399.30769230769230.03253.5484852346119371.42307692307690.03256.2584946962901371.42307692307690.03258.9685041579683371.42307692307690.01261.6785136196465427.19230769230770.02264.38852308132465399.30769230769230.04267.0985325430029343.538461538461550.03269.80854200468104371.42307692307690.03272.51855146635927371.42307692307690.04275.22856092803744343.538461538461550.04277.9385703897156343.538461538461550.02280.64857985139383399.30769230769230.04283.35858931307206343.538461538461550.02286.0685987747502399.30769230769232018-03-26T06:04:51Z0.04288.7786082364284343.538461538461550.04291.48861769810657343.538461538461550.03294.1986271597848371.42307692307690.04296.908636621463343.538461538461550.03299.61864608314124371.42307692307690.04302.3286555448194343.538461538461550.03305.0386650064976371.42307692307690.02307.74867446817575399.30769230769230.03310.4586839298539371.42307692307690.02313.1686933915322399.30769230769230.02315.87870285321037399.30769230769230.03318.58871231488854371.42307692307690.04321.29872177656677343.538461538461550.05324.00873123824493315.65384615384620.02326.71874069992316399.30769230769230.01329.42875016160133427.19230769230770.01332.1387596232795427.19230769230772018-03-26T06:07:42Z0.03334.8487690849577371.42307692307690.04337.5587785466359343.53846
1538461550.03340.2687880083141371.42307692307690.05342.97879746999234315.65384615384620.04345.6888069316705343.538461538461550.03348.39881639334874371.42307692307690.04351.10882585502685343.538461538461550.03353.8188353167051371.42307692307690.03356.52884477838325371.42307692307690.03359.23885424006147371.42307692307690.02361.9488637017397399.30769230769230.04364.65887316341787343.538461538461550.03367.36888262509603371.42307692307690.02370.07889208677426399.30769230769230.04372.78890154845243343.538461538461550.02375.49891101013066399.30769230769230.02378.2089204718088399.30769230769232018-03-26T06:10:33Z0.02380.918929933487399.30769230769230.04383.6289393951652343.538461538461550.01386.3389488568434427.19230769230770.04389.04895831852167343.538461538461550.16391.758967780199848.9230769230769620.05394.46897724187795315.65384615384620.02397.1789867035562399.30769230769230.04399.88899616523435343.538461538461550.04402.59900562691263343.538461538461550.02405.3090150885908399.30769230769230.04408.01902455026897343.538461538461550.04410.7290340119472343.538461538461550.03413.43904347362536371.42307692307690.03416.14905293530353371.42307692307690.04418.85906239698176343.538461538461550.04421.5690718586599343.538461538461550.02424.27908132033815399.30769230769232018-03-26T06:13:24Z0.02426.9890907820163399.30769230769230.03429.69910024369455371.42307692307690.03432.40910970537277371.42307692307690.02435.11911916705094399.30769230769230.03437.8291286287291371.42307692307690.04440.5391380904073343.538461538461550.06443.24914755208545287.769230769230830.02445.9591570137637399.30769230769230.01448.6691664754419427.19230769230770.02451.3791759371201399.30769230769230.02454.0891853987983399.30769230769230.02456.79919486047646399.30769230769230.01459.50920432215463427.19230769230770.02462.21921378383286399.30769230769230.04464.9292232455111343.538461538461550.02467.63923270718925399.30769230769230.05470.3492421688674315.65384615384622018-03-26T06:16:15Z0.05473.05925163054565315.6
5384615384620.02475.7692610922238399.30769230769230.04478.4792705539021343.538461538461550.01481.1892800155802427.19230769230770.04483.8992894772583343.538461538461550.03486.6092989389366371.42307692307690.01489.3193084006147427.19230769230770.03492.029317862293371.4230769230769By IPBy Hostname \ No newline at end of file diff --git a/posts/resolving-a-dns-issue/before.svg b/posts/resolving-a-dns-issue/before.svg new file mode 100644 index 0000000..54536ef --- /dev/null +++ b/posts/resolving-a-dns-issue/before.svg @@ -0,0 +1,4 @@ + +HTTP GET by IP vs. HTTP GET by Hostname0011223344552018-03-26T03:37:29Z2018-03-26T03:40:26Z2018-03-26T03:43:34Z2018-03-26T03:46:31Z2018-03-26T03:49:28Z2018-03-26T03:52:24Z2018-03-26T03:55:32Z2018-03-26T03:58:31Z2018-03-26T04:01:33Z2018-03-26T04:04:37Z2018-03-26T04:07:39ZHTTP GET by IP vs. HTTP GET by HostnameSeconds0.019.916864452805141454.27159122465982018-03-26T03:37:29Z0.0112.702500535053776454.27159122465980.0215.48813661730241453.466259372396560.0118.27377269955105454.27159122465980.0121.059408781799686454.27159122465980.0223.84504486404832453.466259372396560.0626.630680946296955450.24493196334350.0129.41631702854559454.27159122465980.0332.20195311079423452.660927520133270.0234.98758919304286453.466259372396560.0137.773225275291495454.27159122465980.0140.55886135754013454.27159122465980.0143.34449743978877454.27159122465980.0246.1301335220374453.466259372396560.0148.915769604286034454.27159122465980.0251.70140568653467453.466259372396560.0154.48704176878331454.27159122465980.0257.27267785103194453.466259372396562018-03-26T03:40:26Z0.0160.05831393328057454.27159122465980.0162.84395001552921454.27159122465980.0265.62958609777785453.466259372396560.0168.41522218002649454.27159122465980.0171.20085826227512454.27159122465980.0173.98649434452376454.27159122465980.0176.77213042677238454.27159122465980.0279.55776650902102453.466259372396560.0282.34340259126965453.466259372396560.0185.1290386735183454.27159122465980.0287.91467475576692453.46
6259372396560.0290.70031083801557453.466259372396560.0193.4859469202642454.27159122465980.0196.27158300251284454.27159122465980.0199.05721908476147454.27159122465980.01101.8428551670101454.27159122465980.02104.62849124925873453.466259372396562018-03-26T03:43:34Z0.02107.41412733150737453.466259372396560.01110.199763413756454.27159122465980.01112.98539949600465454.27159122465980.01115.77103557825328454.27159122465980.01118.55667166050192454.27159122465980.01121.34230774275055454.27159122465980.01124.1279438249992454.27159122465980.01126.91357990724782454.27159122465980.01129.69921598949648454.27159122465980.01132.4848520717451454.27159122465980.01135.27048815399377454.27159122465980.02138.0561242362424453.466259372396560.01140.841760318491454.27159122465980.01143.62739640073963454.27159122465980.01146.41303248298829454.27159122465980.01149.19866856523691454.27159122465980.01151.98430464748554454.27159122465982018-03-26T03:46:31Z0.01154.76994072973417454.27159122465980.01157.55557681198283454.27159122465980.02160.34121289423146453.466259372396560.01163.1268489764801454.27159122465980.01165.91248505872872454.27159122465980.01168.69812114097738454.27159122465980.01171.483757223226454.27159122465980.01174.26939330547464454.27159122465980.01177.05502938772327454.27159122465980.02179.84066546997192453.466259372396560.01182.62630155222055454.27159122465980.02185.41193763446918453.466259372396560.01188.19757371671778454.27159122465980.01190.98320979896644454.27159122465980.01193.76884588121507454.27159122465980.01196.5544819634637454.27159122465980.01199.34011804571233454.27159122465982018-03-26T03:49:28Z0.02202.125754127961453.466259372396560.01204.91139021020962454.27159122465980.01207.69702629245825454.27159122465980.01210.48266237470688454.27159122465980.03213.26829845695553452.660927520133270.02216.05393453920416453.466259372396560.01218.8395706214528454.27159122465980.01221.62520670370142454.27159122465980.02224.41084278595008453.466259372396560.01227.1964788681987454.2
7159122465980.01229.98211495044734454.27159122465980.01232.76775103269597454.27159122465980.01235.55338711494463454.27159122465980.01238.33902319719323454.27159122465980.01241.12465927944186454.27159122465980.01243.9102953616905454.27159122465980.01246.69593144393917454.27159122465982018-03-26T03:52:24Z0.01249.48156752618777454.27159122465980.01252.2672036084364454.27159122465980.01255.05283969068506454.27159122465980.02257.8384757729337453.466259372396560.01260.6241118551824454.27159122465980.01263.40974793743095454.27159122465980.01266.1953840196796454.27159122465980.01268.98102010192827454.27159122465980.01271.76665618417684454.27159122465980.01274.55229226642547454.27159122465980.01277.3379283486741454.27159122465980.01280.1235644309228454.27159122465980.01282.9092005131714454.27159122465980285.69483659542004455.07692307692310.01288.4804726776687454.27159122465980.02291.26610875991736453.466259372396560.01294.05174484216593454.27159122465982018-03-26T03:55:32Z0.01296.8373809244146454.27159122465980.02299.6230170066632453.466259372396560.01302.4086530889119454.27159122465980.01305.1942891711605454.27159122465980.02307.9799252534092453.466259372396560.02310.76556133565776453.466259372396560.01313.5511974179064454.27159122465980.02316.336833500155453.466259372396560.02319.12246958240365453.466259372396560.01321.9081056646523454.27159122465980.02324.6937417469009453.466259372396560.02327.4793778291496453.466259372396560.01330.2650139113982454.27159122465980.01333.05064999364686454.27159122465980.02335.8362860758955453.466259372396560.02338.6219221581441453.466259372396560.01341.40755824039275454.27159122465982018-03-26T03:58:31Z0.01344.1931943226414454.27159122465980.02346.97883040489453.466259372396560.02349.7644664871387453.466259372396560.01352.5501025693873454.27159122465980.01355.33573865163595454.27159122465980.01358.1213747338846454.27159122465980.02360.90701081613315453.466259372396560.01363.69264689838184454.27159122465980.01366.4782829806304454.27159122465
980.01369.2639190628791454.27159122465980.01372.0495551451277454.27159122465980.01374.8351912273764454.27159122465980.01377.620827309625454.27159122465980.01380.4064633918737454.27159122465980.01383.19209947412224454.27159122465980.01385.97773555637093454.27159122465980.01388.7633716386195454.27159122465982018-03-26T04:01:33Z0.01391.5490077208682454.27159122465980.01394.3346438031168454.27159122465980.01397.1202798853655454.27159122465980.01399.9059159676141454.27159122465980.02402.69155204986276453.466259372396560.01405.47718813211134454.27159122465980.01408.26282421436454.27159122465980.01411.0484602966086454.27159122465980.01413.8340963788572454.27159122465980.01416.6197324611059454.27159122465980.02419.40536854335454453.466259372396560.01422.19100462560317454.27159122465980.01424.9766407078518454.27159122465980.01427.76227679010043454.27159122465980.01430.54791287234906454.27159122465980.01433.3335489545977454.27159122465980.01436.1191850368463454.27159122465982018-03-26T04:04:37Z0.01438.904821119095454.27159122465980.01441.69045720134363454.27159122465980.01444.47609328359226454.27159122465980.01447.2617293658409454.27159122465980.01450.0473654480895454.27159122465980.01452.83300153033815454.27159122465980.01455.6186376125868454.27159122465980.01458.4042736948354454.27159122465980.01461.18990977708404454.27159122465980.01463.9755458593327454.27159122465980.07466.76118194158136449.439600111080270.01469.54681802383454.27159122465980.03472.33245410607856452.660927520133270.01475.11809018832724454.27159122465980.01477.9037262705759454.27159122465980.01480.6893623528245454.27159122465980.01483.47499843507313454.27159122465982018-03-26T04:07:39Z0.01486.2606345173218454.27159122465980.01489.04627059957045454.27159122465980.02491.8319066818191453.466259372396560.01494.61754276406765454.27159122465980.01497.4031788463163454.27159122465980.01500.188814928565454.27159122465980.01502.9744510108135454.27159122465980.01505.7600870930622454.27159122465980.029.9168644528051414
53.466259372396562018-03-26T03:37:29Z0.0412.702500535053776451.855595667870030.0215.48813661730241453.466259372396560.0318.27377269955105452.660927520133270.0321.059408781799686452.660927520133275.5323.845044864048329.7284087753401990.0226.630680946296955453.466259372396560.0229.41631702854559453.466259372396560.1432.20195311079423443.802277145237440.0234.98758919304286453.466259372396560.0337.773225275291495452.660927520133270.0140.55886135754013454.27159122465980.0243.34449743978877453.466259372396560.0246.1301335220374453.466259372396560.0348.915769604286034452.660927520133270.0651.70140568653467450.24493196334350.0354.48704176878331452.660927520133270.0357.27267785103194452.660927520133272018-03-26T03:40:26Z0.0360.05831393328057452.660927520133275.5362.843950015529219.7284087753401995.5365.629586097777859.7284087753401995.5368.415222180026499.7284087753401990.0271.20085826227512453.466259372396560.2673.98649434452376434.13829491807830.0376.77213042677238452.660927520133270.2779.55776650902102433.33296306581510.0382.34340259126965452.660927520133270.0385.1290386735183452.660927520133270.0387.91467475576692452.660927520133270.0390.70031083801557452.660927520133270.0393.4859469202642452.660927520133270.0296.27158300251284453.466259372396560.0299.05721908476147453.466259372396560.03101.8428551670101452.660927520133270.03104.62849124925873452.660927520133272018-03-26T03:43:34Z0.03107.41412733150737452.660927520133270.02110.199763413756453.466259372396560.02112.98539949600465453.466259372396560.03115.77103557825328452.660927520133270.26118.55667166050192434.13829491807830.01121.34230774275055454.27159122465980.02124.1279438249992453.466259372396560.01126.91357990724782454.27159122465980.04129.69921598949648451.855595667870030.03132.4848520717451452.660927520133270.03135.27048815399377452.660927520133275.52138.056124236242410.533740627603550.03140.841760318491452.660927520133270.03143.62739640073963452.660927520133270.04146.41303248298829451.855595667870030.03149.19866
856523691452.660927520133270.02151.98430464748554453.466259372396562018-03-26T03:46:31Z0.03154.76994072973417452.660927520133270.01157.55557681198283454.27159122465985.53160.341212894231469.7284087753401990.02163.1268489764801453.466259372396560.04165.91248505872872451.855595667870030.02168.69812114097738453.466259372396560.03171.483757223226452.660927520133270.03174.26939330547464452.660927520133270.26177.05502938772327434.13829491807830.27179.84066546997192433.33296306581510.02182.62630155222055453.466259372396560.03185.41193763446918452.660927520133270.02188.19757371671778453.466259372396560.02190.98320979896644453.466259372396560.02193.76884588121507453.466259372396560.02196.5544819634637453.466259372396560.02199.34011804571233453.466259372396562018-03-26T03:49:28Z0.03202.125754127961452.660927520133270.02204.91139021020962453.466259372396560.03207.69702629245825452.660927520133270.02210.48266237470688453.466259372396565.53213.268298456955539.7284087753401990.02216.05393453920416453.466259372396560.03218.8395706214528452.660927520133270.02221.62520670370142453.466259372396560.03224.41084278595008452.660927520133270.03227.1964788681987452.660927520133270.04229.98211495044734451.855595667870030.04232.76775103269597451.855595667870030.27235.55338711494463433.33296306581510.03238.33902319719323452.660927520133270.02241.12465927944186453.466259372396560.02243.9102953616905453.466259372396560.27246.69593144393917433.33296306581512018-03-26T03:52:24Z0.02249.48156752618777453.466259372396560.02252.2672036084364453.466259372396560.03255.05283969068506452.660927520133275.52257.838475772933710.533740627603550.02260.6241118551824453.466259372396560.02263.40974793743095453.466259372396560.02266.1953840196796453.466259372396565.53268.981020101928279.7284087753401990.02271.76665618417684453.466259372396560.02274.55229226642547453.466259372396560.27277.3379283486741433.33296306581515.52280.123564430922810.533740627603550.02282.9092005131714453.466259372396560.02285.694836595420
04453.466259372396560.02288.4804726776687453.466259372396560.02291.26610875991736453.466259372396560.26294.05174484216593434.13829491807832018-03-26T03:55:32Z0.02296.8373809244146453.466259372396560.03299.6230170066632452.660927520133270.03302.4086530889119452.660927520133270.03305.1942891711605452.660927520133271.53307.9799252534092331.86114968064430.03310.76556133565776452.660927520133270.03313.5511974179064452.660927520133270.02316.336833500155453.466259372396560.02319.12246958240365453.466259372396560.02321.9081056646523453.466259372396560.02324.6937417469009453.466259372396560.02327.4793778291496453.466259372396560.02330.2650139113982453.466259372396560.03333.05064999364686452.660927520133275.53335.83628607589559.7284087753401990.02338.6219221581441453.466259372396560.02341.40755824039275453.466259372396562018-03-26T03:58:31Z0.03344.1931943226414452.660927520133275.53346.978830404899.7284087753401990.02349.7644664871387453.466259372396560.02352.5501025693873453.466259372396560.04355.33573865163595451.855595667870030.04358.1213747338846451.855595667870035.54360.907010816133158.9230769230769620.03363.69264689838184452.660927520133270.02366.4782829806304453.466259372396560.02369.2639190628791453.466259372396560.04372.0495551451277451.855595667870030.02374.8351912273764453.466259372396560.02377.620827309625453.466259372396560.03380.4064633918737452.660927520133270.03383.19209947412224452.660927520133270.03385.97773555637093452.660927520133270.02388.7633716386195453.466259372396562018-03-26T04:01:33Z0.05391.5490077208682451.05026381560685.53394.33464380311689.7284087753401995.53397.12027988536559.7284087753401990.02399.9059159676141453.466259372396560.04402.69155204986276451.855595667870030.03405.47718813211134452.660927520133270.02408.26282421436453.466259372396560.02411.0484602966086453.466259372396560.07413.8340963788572449.439600111080270.03416.6197324611059452.660927520133270.02419.40536854335454453.466259372396561.52422.19100462560317332.666481532907540.03424.
9766407078518452.660927520133270.04427.76227679010043451.855595667870030.27430.54791287234906433.33296306581510.02433.3335489545977453.466259372396560.02436.1191850368463453.466259372396562018-03-26T04:04:37Z0.03438.904821119095452.660927520133270.03441.69045720134363452.660927520133270.04444.47609328359226451.855595667870030.03447.2617293658409452.660927520133270.02450.0473654480895453.466259372396560.02452.83300153033815453.466259372396560.02455.6186376125868453.466259372396560.02458.4042736948354453.466259372396560.02461.18990977708404453.466259372396560.02463.9755458593327453.466259372396565.53466.761181941581369.7284087753401990.03469.54681802383452.660927520133270.02472.33245410607856453.466259372396560.02475.11809018832724453.466259372396565.52477.903726270575910.533740627603550.04480.6893623528245451.855595667870030.02483.47499843507313453.466259372396562018-03-26T04:07:39Z5.53486.26063451732189.7284087753401990.03489.04627059957045452.660927520133275.53491.83190668181919.7284087753401990.03494.61754276406765452.660927520133275.53497.40317884631639.7284087753401990.03500.188814928565452.660927520133275.53502.97445101081359.7284087753401990.02505.7600870930622453.46625937239656By IPBy Hostname \ No newline at end of file diff --git a/posts/resolving-a-dns-issue/index.html b/posts/resolving-a-dns-issue/index.html new file mode 100644 index 0000000..aa8d05e --- /dev/null +++ b/posts/resolving-a-dns-issue/index.html @@ -0,0 +1,139 @@ +Resolving A DNS Issue

Resolving A DNS Issue

Haha. Get it? Resolving a DNS issue. OK, that was bad. You don't have to read anymore, but I'm SOA into this. You might even say I'm in the zone. I think it's gonna be A great read, so consider sticking around, 'cuz there's no TLD;R.

Symptoms

A few months ago I switched ISPs to try to support a smaller regional ISP instead of one of the monopolistic behemoths. The transition was largely pain-free, but after a little while I noticed that something wasn't right on the new network. Periodically my web browsing would be much slower, but it didn't seem to be specific to any particular site. Furthermore, reloading a site that had just been slow usually worked without issue.

I let this go on for longer than I'm comfortable admitting, but finally got fed up when I was trying to curl pages from the same site in a loop. I noticed that the same pages took different amounts of time to fetch between runs of the loop, which seemed very wrong to me since they were static pages.

Troubleshooting

My first step in narrowing down the problem was to pick a website known for its speed and reliability and see if I had problems there. I curled google.com in a loop every 10 seconds and sure enough, every so often the problem would happen there too. The connection would just hang for a few seconds, but eventually complete.

Next, I wondered how I could narrow it down further to make sure the issue wasn't caused on my home network. Then I remembered that my gateway could be accessed by its IP of 192.168.1.1 and the hostname router.asus.com. I pointed curl at my router's hostname. The same problem was happening with a domain that resolved to my local network!

OK, so it didn't matter whether I was leaving my network or not. Traffic was slooow either way. I changed my curl loop to point at my router by its IP. No issues whatsoever. At this point I recalled the age-old sysadmin wisdom. It's always DNS!
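
One way to test the "it's DNS" hunch directly is to time the name lookup on its own, with no HTTP request involved. The following is a minimal sketch, assuming Python's standard `socket.getaddrinfo` goes through the same system resolver path that curl uses. `time_lookup` and `slow_lookups` are hypothetical helpers, not part of the scripts in this post, and the 1-second threshold is an arbitrary choice.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.getaddrinfo):
    """Return the seconds spent resolving `hostname`, with no HTTP request involved."""
    start = time.perf_counter()
    resolver(hostname, 80)  # may raise socket.gaierror if the name does not resolve
    return time.perf_counter() - start


def slow_lookups(hostname, attempts=5, threshold=1.0, resolver=socket.getaddrinfo):
    """Time several lookups and return only the ones slower than `threshold` seconds."""
    timings = [time_lookup(hostname, resolver) for _ in range(attempts)]
    return [t for t in timings if t > threshold]
```

Calling `slow_lookups('router.asus.com')` a few times on the affected network would show whether the stalls happen before any connection is made. The `resolver` parameter exists only so the timing logic can be exercised without a network.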

It's Always DNS. Converted to SVG from a Reddit post by u/tehrabbitt.

Gathering Data

Clearly this was an intermittent DNS issue. I could have gone right into fixing the problem, but I wanted to know just how bad it was. In order to do that I needed data. I collected timing information for back-to-back HTTP GET requests, one by IP address and one by hostname, against the same host over a 2-hour period with the help of this script.

#!/usr/bin/env zsh
+
+# 720 * 10s = 2hrs
+: ${INTERVAL=10}
+: ${NUM_TESTS=720}
+
+: ${ROUTER_IP="192.168.1.1"}
+: ${ROUTER_HOSTNAME="router.asus.com"}
+
+: ${OUTPUT_FILE="./http-test-results.csv"}
+
+function now() {
+    # ISO 8601 FTW!
+    date -u "+%Y-%m-%dT%H:%M:%SZ"
+}
+
+function http-test() {
+    local host=$1
+    /usr/bin/time -f "%e" curl -s $host > /dev/null
+}
+
+function run-tests() {
+    # Create CSV header.
+    echo "Datetime,Seconds for HTTP GET by IP,Seconds for HTTP GET by Hostname"
+
+    # A sample CSV row will look like
+    # 2018-03-26T03:37:29Z,0.01,0.02
+
+    # For each iteration spit out the current time, how long it takes to curl
+    # with an IP address, and how long it takes to curl with a hostname.
+    for t in {1..${NUM_TESTS}}
+    do
+        # Redirect stderr to stdout with 2>&1 to capture GNU time's output.
+        echo "$(now),$(http-test ${ROUTER_IP} 2>&1),$(http-test ${ROUTER_HOSTNAME} 2>&1)"
+        sleep ${INTERVAL}
+    done
+}
+
+run-tests >> ${OUTPUT_FILE}
+

That was going to give me the data I needed, but reading raw CSVs is a pain. I don't do graphing very often, but I figured visualization would help, so I did a quick survey of Python graphing libraries and found the Pygal library. It turned out to be much more flexible than Matplotlib for my needs and gave me great looking SVGs customized to match the style of this site.

#!/usr/bin/env python3.6
+
+import sys
+from typing import Iterator
+
+import pandas as pd
+from pygal import Line
+from pygal.style import Style
+
+
+class GruvboxStyle(Style):
+    """ A gruvbox-inspired Pygal style. """
+
+    background = '#282828'
+    plot_background = '#1d2021'
+    foreground = '#fdf4c1'
+    foreground_strong = '#fdf4c1'
+    foreground_subtle = '#fdf4c1'
+    colors = ('#8ec07c', '#fa5c4b')
+
+
+def dilute_datetimes(datetimes: pd.Series, factor: int) -> Iterator[str]:
+    """ Lots of datetimes overlap and become unreadable, make some space. """
+    dilute = lambda t: t[1] if t[0] % factor == 0 else ''
+    yield from map(dilute, enumerate(datetimes))
+
+
+def generate_chart(data: pd.DataFrame) -> Line:
+    line_chart = Line(
+        js=(),  # The tooltips are really nice, but I don't want any JS.
+        style=GruvboxStyle,
+        x_label_rotation=30
+    )
+
+    # Water those datetimes down so they don't overlap and we can read them!
+    datetimes = data['Datetime']
+    dilution_factor = datetimes.shape[0] // 10
+    datetimes = dilute_datetimes(datetimes, factor=dilution_factor)
+
+    line_chart.title = 'HTTP GET by IP vs. HTTP GET by Hostname'
+    line_chart.y_title = 'Seconds'
+    line_chart.x_labels = datetimes
+    line_chart.add(
+        title='By IP',
+        values=data['Seconds for HTTP GET by IP']
+    )
+    line_chart.add(
+        title='By Hostname',
+        values=data['Seconds for HTTP GET by Hostname']
+    )
+    return line_chart
+
+
+def main(argv: list) -> None:
+    # Use the argv parameter rather than reaching back into sys.argv.
+    data = pd.read_csv(argv[1])
+    output = argv[2]
+    chart = generate_chart(data)
+    chart.render_to_file(output)
+
+
+if __name__ == '__main__':
+    main(sys.argv)
+

Now I could run a few simple commands and get a chart to help me understand what I was dealing with.

#!/usr/bin/env zsh
+export OUTPUT_FILE=before.csv
+zsh http-test.zsh && python3.6 chart.py $OUTPUT_FILE ${OUTPUT_FILE:r}.svg
+

I ended up truncating the results for the graphs below to only show 30 minutes of the 2 hour data to help readability, but the data looks just about the same anyway.
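
The post doesn't show how the data was truncated. One way to do it, sketched here with the standard library rather than pandas, is to keep only the rows whose timestamp falls within 30 minutes of the first sample. `truncate_rows` is a hypothetical helper, the sample rows are made up, and the column name is the one from the CSV header written by the test script.

```python
import csv
import io
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # same format the now() helper emits


def truncate_rows(rows, minutes=30):
    """Keep rows whose Datetime is within `minutes` of the first row's Datetime."""
    rows = list(rows)
    if not rows:
        return rows
    start = datetime.strptime(rows[0]["Datetime"], TIMESTAMP_FORMAT)
    cutoff = start + timedelta(minutes=minutes)
    return [
        row for row in rows
        if datetime.strptime(row["Datetime"], TIMESTAMP_FORMAT) < cutoff
    ]


# Tiny made-up sample in the same shape as the real CSV.
sample = io.StringIO(
    "Datetime,Seconds for HTTP GET by IP,Seconds for HTTP GET by Hostname\n"
    "2018-03-26T03:37:29Z,0.01,0.02\n"
    "2018-03-26T03:57:29Z,0.01,5.53\n"
    "2018-03-26T04:17:29Z,0.02,0.03\n"
)
kept = truncate_rows(csv.DictReader(sample))
print(len(kept))
```

Filtering by timestamp instead of taking the first 180 rows avoids assuming that samples are exactly 10 seconds apart, which they aren't once the time spent in curl is added to each iteration.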

HTTP GET by IP vs. HTTP GET by Hostname before the fix.

Wow! That's really bad. You can't even see the By IP series because it's dwarfed by these enormous outliers from By Hostname. These were local DNS requests. There's no reason they should have been taking more than 5 seconds to complete.

Resolution

I logged into my router's administrative console and sure enough, right there in the logs was a DNS error.

Mar 25 21:44:03 dnsmasq[31700]: nameserver 208.76.152.1 refused to do a recursive query
+

I was still using the default upstream DNS given by my ISP and it was refusing recursive DNS queries. My OS wasn't running a caching DNS resolver locally, so I was totally at the mercy of my router. When the router went out to lunch trying to satisfy a DNS request my connections did too.
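
To check whether a particular nameserver is willing to recurse, you can send it a query with the "recursion desired" bit set and look at the "recursion available" bit and the response code in the reply. The sketch below does that with only the standard library, following the header layout in RFC 1035. The helper names (`build_query`, `parse_flags`, `check_recursion`) and the `example.com` default are illustrative assumptions, not anything taken from the router or from dnsmasq.

```python
import socket
import struct

RCODE_REFUSED = 5  # RFC 1035 response code for a refused query


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query for an A record with the RD (recursion desired) bit set."""
    # Header: ID, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def parse_flags(response):
    """Return (recursion_available, rcode) from the header of a DNS response."""
    _, flags = struct.unpack("!HH", response[:4])
    return bool(flags & 0x0080), flags & 0x000F


def check_recursion(nameserver, hostname="example.com", timeout=3.0):
    """Send one UDP query to `nameserver` and report how it answered."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (nameserver, 53))
        response, _ = sock.recvfrom(512)
    return parse_flags(response)
```

A server that won't recurse for you typically answers with the RA bit cleared or with an rcode equal to `RCODE_REFUSED`, so `check_recursion` gives a quick way to compare the nameservers a router has been handed.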

Curiously, only one of the two nameservers my router was configured to use appeared in the logs with that error message. Not knowing exactly how the router manufacturer configured DNSMasq or how the upstream DNS servers were configured, my working hypothesis is that the one refusing recursive queries was the primary and that requests that weren't cached always had to go to two servers.

My assumption is that DNSMasq is pretty badly misconfigured on my router, as I wouldn't expect router.asus.com to even need a recursive query. Ultimately I should probably run my own local DNS. Maybe I'll even take trust-dns for a spin. That would be a long-term project though, so I opted instead to simply change my nameservers to Google's public ones. The problem finally went away once recursive queries were supported!

Measuring Success

Once I was confident I'd fixed the problem I ran the same chart generation commands as before, being careful to set OUTPUT_FILE="after.csv" so as not to overwrite my previous data.

HTTP GET by IP vs. HTTP GET by Hostname after the fix.

Much better! Even the most extreme request was under two tenths of a second. Just how much did the situation improve though? I wrote one last script to find out.

#!/usr/bin/env zsh
+function percent_failures() {
+    local file="${1}"
+    local pattern="${2}"
+    local num_failures=$(grep $pattern $file | wc -l)
+    local num_results=$(wc -l $file | cut -d\  -f1)
+    echo "scale=4; ${num_failures} / ${num_results} * 100" | bc -l
+}
+
+percent_failures before.csv '5\.'
+percent_failures after.csv '5\.'
+

Prior to fixing the issue, 11.66% of all requests took longer than 5 seconds. After applying the fix, that dropped to 0.13%. I'd say that's a noticeable improvement!
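
The grep pattern above matches any "5." anywhere on a line, and the line count includes the CSV header, so the percentages are approximate. A stricter alternative, sketched here under the assumption that every timing cell is a plain number, is to parse the hostname column and compare it against the threshold. `percent_slow` is a hypothetical helper and the sample rows are made up, so its output will not necessarily match the figures quoted above.

```python
import csv
import io

HOSTNAME_COLUMN = "Seconds for HTTP GET by Hostname"


def percent_slow(csv_file, threshold=5.0, column=HOSTNAME_COLUMN):
    """Return the percentage of rows whose `column` value exceeds `threshold` seconds."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        return 0.0
    slow = sum(1 for row in rows if float(row[column]) > threshold)
    return 100.0 * slow / len(rows)


# Made-up sample: one of four hostname requests stalls for more than 5 seconds.
sample = io.StringIO(
    "Datetime,Seconds for HTTP GET by IP,Seconds for HTTP GET by Hostname\n"
    "2018-03-26T03:37:29Z,0.01,0.02\n"
    "2018-03-26T03:37:39Z,0.01,5.53\n"
    "2018-03-26T03:37:49Z,0.02,0.03\n"
    "2018-03-26T03:37:59Z,0.01,0.26\n"
)
print(percent_slow(sample))
```

With real data you would pass an open file instead, for example `with open('before.csv') as f: percent_slow(f)`.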

Bonus

In case you were curious about how the It's Always DNS diagram above was generated, here's the Graphviz source and dot command to reproduce it.

// File: its-always-dns.dot
+
+digraph G {
+    graph [ splines=ortho nodesep=2 fontcolor="#fdf4c1" ];
+    node [ style=filled color="#282828" fillcolor="#282828", fontcolor="#fdf4c1" ];
+    edge [ color="#282828" fontcolor="#282828" ];
+
+    is_it_dns [shape=diamond label="Is it DNS?"];
+    check_bad_records [shape=box label="Check DNS for Bad Records"];
+    is_it_resolved [shape=diamond label="Is the Issue Resolved?"];
+    you_missed_something [shape=box label="You Obviously Missed Something"];
+    its_always_dns [shape=box label="It's Always DNS"];
+
+    is_it_dns -> check_bad_records [xlabel="Yes"];
+    check_bad_records -> is_it_resolved;
+    is_it_resolved -> its_always_dns [xlabel="Yes"];
+    is_it_resolved -> you_missed_something [xlabel="No"];
+    you_missed_something -> check_bad_records;
+    you_missed_something -> is_it_dns [xlabel="No"];
+}
+
#!/usr/bin/env zsh
+dot -Tsvg its-always-dns.dot -o its-always-dns.svg
+

In the interest of making this post maximally reproducible I've put all the raw data and scripts used in this post in a GitHub repo for anyone who wants to play with them on their own network.

\ No newline at end of file diff --git a/posts/resolving-a-dns-issue/its-always-dns.svg b/posts/resolving-a-dns-issue/its-always-dns.svg new file mode 100644 index 0000000..869d5d9 --- /dev/null +++ b/posts/resolving-a-dns-issue/its-always-dns.svg @@ -0,0 +1,83 @@
+[Figure: flowchart "G" with nodes "Is it DNS?", "Check DNS for Bad Records", "Is the Issue Resolved?", "You Obviously Missed Something", and "It's Always DNS"; edges labeled Yes and No]
diff --git a/posts/stateful-callbacks-in-python/index.html b/posts/stateful-callbacks-in-python/index.html new file mode 100644 index 0000000..f941328 --- /dev/null +++ b/posts/stateful-callbacks-in-python/index.html @@ -0,0 +1,65 @@ +Stateful Callbacks in Python
Me

Stateful Callbacks in Python

If you're unfamiliar with what a callback is, don't worry, we can sort that out quickly. If callbacks are old hat for you, you might want to skip to the interesting bit.

Simply put, a callback is a function that is passed as an argument to another function which may execute it.

Take, for example, these functions:

def bar():
+    return "I'm the callback!"
+
+def foo(func):
+    return func()
+

If we call foo like this

>>> foo(bar)
+"I'm the callback!"
+

then bar is a callback.

Why Should I Use A Callback?

There are many reasons to use callbacks. For me, the most compelling is customization. Let's take a look at a Python built-in as an example. Say we have a list of users as dictionaries with a name and an age:

users = [
+    dict(age=77, name='John Cleese'),
+    dict(age=74, name='Michael Palin'),
+]
+

Imagine that we want to sort our users. If we had just a list of ages or a list of names we could easily do this with the built-in sorted function, but by default Python has no idea how to compare our dictionaries during sorting.

>>> sorted(users)
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+TypeError: unorderable types: dict() < dict()
+

Should it sort by age? By name? We need to tell Python how this should be done. Fortunately, the sorted function has a keyword argument called key that takes, you guessed it, a callback. Let's create some of our own!

def by_age(user):
+    return user['age']
+
+def by_name(user):
+    return user['name']
+

Armed with these callbacks we can sort our users.

>>> sorted(users, key=by_age)
+[{'age': 74, 'name': 'Michael Palin'}, {'age': 77, 'name': 'John Cleese'}]
+>>> sorted(users, key=by_name)
+[{'age': 77, 'name': 'John Cleese'}, {'age': 74, 'name': 'Michael Palin'}]
+

Since the sorted function takes a callback for the key argument we are free to customize its behavior. All we have to do is define a function that returns the key we intend to sort by and as long as that's an orderable type Python will take care of the rest.
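
As an aside, the key callback doesn't have to be a named function we write ourselves. Here's a minimal sketch, assuming a users list shaped like the one above (it's redefined here so the snippet runs on its own), that uses a lambda and the standard library's operator.itemgetter instead. The variable names youngest_first and alphabetical are just for illustration.

```python
from operator import itemgetter

# Same shape as the users list above, redefined so the sketch runs alone.
users = [
    dict(age=77, name='John Cleese'),
    dict(age=74, name='Michael Palin'),
]

# A lambda is an anonymous function, so it can serve as the callback.
youngest_first = sorted(users, key=lambda user: user['age'])

# itemgetter('name') builds a callable that behaves like our by_name function.
alphabetical = sorted(users, key=itemgetter('name'))

print([user['name'] for user in youngest_first])
print([user['name'] for user in alphabetical])
```

Whether a named function, a lambda, or itemgetter reads best is mostly a matter of taste. They're all just callables as far as sorted is concerned.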

What Does It Mean to Have State?

So, by now we have something of an idea of what callbacks are, how we can use them, and why, but what's the point of state? State is most easily described as a memory of prior events. This is the core of what every program does and we use it all the time, even if we don't realize it. Heck, even saving a variable involves keeping track of state.

>>> baz = 1  # The Python interpreter is now tracking the state of 'baz'.
+>>> print(baz)  # We can recall that state at a later point.
+1
+

Basically, we need state if we care to remember what happened previously so that we can make decisions about what to do next.

What Normally Happens to State Inside a Callback?

In our first callback function we didn't define any names. To demonstrate what typically happens to state inside the scope of a callback let's make a function that creates some state.

def quux():
+    plugh = "xyzzy"
+    return plugh
+

When we execute this function we get the expected result.

>>> quux()
+'xyzzy'
+

After the function is executed we can see that the plugh name is not defined.

>>> plugh
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+NameError: name 'plugh' is not defined
+

This is because when the function is finished executing its frame is removed from the call stack along with any locally defined variables. By itself our callback can't remember anything.
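
To see that forgetfulness in action, here's a small sketch of a function that tries to count its own calls using only a local variable. The name forgetful_counter is made up for this illustration.

```python
def forgetful_counter():
    count = 0     # A brand new local is created on every call...
    count += 1
    return count  # ...and discarded as soon as we return.

first = forgetful_counter()
second = forgetful_counter()
print(first, second)  # Both calls report 1.
```

No matter how many times we call it, the count never gets past 1, because nothing survives from one call to the next.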

Stateful Callbacks

Alright, so we know what callbacks are, we know what state is. How can we combine the two to make a callback that retains its state? As we saw above we can't rely on any state that we define inside our callback. The trick to making a stateful callback is to rely on names bound to an external scope.

To motivate creating a stateful callback let's say that we still want to sort users like we did above, only now we have 1 million users. It's going to take a while to sort those users, so it would be nice to have a progress report so we know something is still happening, say, once every 10,000 users.

Using Functions

To use names bound to an external scope with a plain ol' function as our callback we'll need to take advantage of closures (which could be an entirely separate post). Here's a function that allows us to use our original by_age and by_name sorters while still giving us progress.

def sort_reporter(func):
+    state = dict(count=0)  # We can't just call this 'count'...
+
+    def _sort_reporter(user):
+        state['count'] += 1  # Because we'd get an UnboundLocalError here.
+        if state['count'] % 10000 == 0:
+            print("Sorted {count} users.".format(count=state['count']))
+        return func(user)
+
+    return _sort_reporter
+

We can use it like so.

>>> sorted_users = sorted(users, key=sort_reporter(by_name))
+Sorted 10000 users.
+Sorted 20000 users.
+# Lots more of this...
+

How does it work? The key is the state dictionary. It lets us keep a mutable reference to a name defined outside the scope of the actual reporter function, _sort_reporter. As the sorted built-in is processing our users each new call to _sort_reporter still gets to refer to the original state.

Note: We could avoid having a state dictionary by using Python 3's nonlocal keyword, but then I'd miss an opportunity in the bonus section.
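
For the curious, here's roughly what a nonlocal variant might look like. This is only a sketch and not the version the rest of this post relies on. The name sort_reporter_nonlocal and the every parameter are hypothetical additions so the example can be tried on a tiny list.

```python
def sort_reporter_nonlocal(func, every=10000):
    count = 0  # A plain integer instead of a state dictionary.

    def _sort_reporter(user):
        nonlocal count  # Rebind the enclosing 'count' rather than a new local.
        count += 1
        if count % every == 0:
            print("Sorted {count} users.".format(count=count))
        return func(user)

    return _sort_reporter

# Try it on a tiny list, reporting every 2 items instead of every 10,000.
key = sort_reporter_nonlocal(len, every=2)
result = sorted(['ccc', 'a', 'bb', 'dddd'], key=key)
print(result)
```

Without the nonlocal statement, the count += 1 line would raise the same UnboundLocalError mentioned in the comments above, since assignment makes count local to _sort_reporter.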

Using Classes

If the functional approach doesn't suit you we can also tackle this problem from an object-oriented angle. Python lets classes define a __call__ method which makes their instances callable. This isn't strictly necessary for an OOP approach, but when we're making callbacks it's nice to be able to treat our instances as functions.

class SortReporter:
+    def __init__(self, func):
+        self.func = func
+        self.count = 0
+
+    def __call__(self, user):
+        self.count += 1
+        if self.count % 10000 == 0:
+            print("Sorted {count} users.".format(count=self.count))
+        return self.func(user)
+

Just as easy to use as our functional option.

>>> sorted_users = sorted(users, key=SortReporter(by_age))
+Sorted 10000 users.
+Sorted 20000 users.
+# One eternity later...
+

Conceptually, this works for much the same reason that the functional approach does. The SortReporter instance and all its associated state lives on because the sorted built-in is carrying around a reference to it and it just pretends to be a plain ol' function whenever sorted needs it to be one.
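
One practical perk of the class approach is that the state stays easy to reach once sorting is done, as long as we hold on to the instance ourselves. Here's a small sketch, assuming the SortReporter class above (repeated so the snippet runs on its own) and a tiny made-up list of words standing in for a million users.

```python
class SortReporter:
    def __init__(self, func):
        self.func = func
        self.count = 0

    def __call__(self, user):
        self.count += 1
        if self.count % 10000 == 0:
            print("Sorted {count} users.".format(count=self.count))
        return self.func(user)

reporter = SortReporter(len)  # Keep our own reference to the instance.
words = sorted(['pear', 'fig', 'banana'], key=reporter)

print(callable(reporter))  # True, thanks to __call__.
print(reporter.count)      # 3, since sorted computes one key per item.
```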

Which Should I Use?

Neither approach is any more or less valid than the other. For this particular example there isn't much more code or complexity either way. I generally regard functions as being simpler than classes, so I prefer those when possible, but classes also provide good structure for more complex callbacks. Try them both!

Bonus

As a bonus, try instantiating a SortReporter and examining its __dict__ attribute. Meditate on what you find there and how it relates to the state dictionary in the functional approach.

If you get really bold and want to try for extra credit, assign the return value of the sort_reporter function to some variable and examine its __closure__ attribute. This may help you explain why the state dictionary doesn't disappear after the sort_reporter function returns.
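
If you'd like a starting point for that extra credit (spoilers ahead), here's one possible way to poke at a closure. It's a sketch that repeats sort_reporter from above so it runs on its own. The names reporter and captured, and the tiny list of words, are made up for illustration.

```python
def sort_reporter(func):
    state = dict(count=0)

    def _sort_reporter(user):
        state['count'] += 1
        if state['count'] % 10000 == 0:
            print("Sorted {count} users.".format(count=state['count']))
        return func(user)

    return _sort_reporter

reporter = sort_reporter(len)
sorted(['pear', 'fig', 'banana'], key=reporter)

# Each cell in __closure__ holds one name captured from the enclosing scope,
# in the same order as the names listed in __code__.co_freevars.
captured = dict(zip(
    reporter.__code__.co_freevars,
    (cell.cell_contents for cell in reporter.__closure__),
))
print(captured['state'])
```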

\ No newline at end of file diff --git a/robots.txt b/robots.txt new file mode 100644 index 0000000..d76b417 --- /dev/null +++ b/robots.txt @@ -0,0 +1,4 @@ +User-agent: * +Disallow: +Allow: / +Sitemap: https://AliAhmadi2004.ir/sitemap.xml diff --git a/rss.xml b/rss.xml new file mode 100644 index 0000000..813bc2d --- /dev/null +++ b/rss.xml @@ -0,0 +1,154 @@ + + + + Ali Ahmadi + https://AliAhmadi2004.ir + Ali Ahmadi personal blog. + Zola + en + + Sun, 31 Dec 2023 04:56:00 -0800 + + avatar.png + Sun, 31 Dec 2023 04:56:00 -0800 + Unknown + https://AliAhmadi2004.ir/posts/avatar-png/ + https://AliAhmadi2004.ir/posts/avatar-png/ + <p>No, not that Avatar. And not the other one either. This post is about +<code>avatar.png</code>, a handful of lines of PHP that have inspired me for a long time.</p> +<p>Around 2011 or 2012 a friend of mine, <a href="https://andrew.kvalhe.im/">Andrew Kvalheim</a>, blew +my mind when he made his Skype profile picture display the IP address of the +computer I was using. It might have looked a bit like this.</p> + + + + Parsing TFTP in Rust + Sat, 31 Dec 2022 16:45:00 -0800 + Unknown + https://AliAhmadi2004.ir/posts/parsing-tftp-in-rust/ + https://AliAhmadi2004.ir/posts/parsing-tftp-in-rust/ + <p>Several years ago I did a take-home interview which asked me to write a <a href="https://en.wikipedia.org/wiki/Trivial_File_Transfer_Protocol">TFTP</a> +server in <a href="https://go.dev/">Go</a>. The job wasn't the right fit for me, but I enjoyed the +assignment. Lately, in my spare time, I've been tinkering with a <a href="https://www.rust-lang.org/">Rust</a> +implementation. Here's what I've done to parse the protocol.</p> + + + + A Fresh Coat of Paint + Fri, 31 Dec 2021 21:04:00 -0800 + Unknown + https://AliAhmadi2004.ir/posts/a-fresh-coat-of-paint/ + https://AliAhmadi2004.ir/posts/a-fresh-coat-of-paint/ + <p>I'm starting the new year with a new job. 
To paraphrase a friend, &quot;it's just +moving from one <code>$BIGCORP</code> to another&quot;, but it's still exciting. I worked my +last gig for 5 years, so I'm nervous, but also very ready to do something new. +While I'm doing one new thing I might as well do another. Taking some time off +between jobs has given me enough breathing room to redo my website.</p> + + + + node.example.com Is An IP Address + Mon, 28 Dec 2020 06:13:05 -0800 + Unknown + https://AliAhmadi2004.ir/posts/node-example-com-is-an-ip-address/ + https://AliAhmadi2004.ir/posts/node-example-com-is-an-ip-address/ + <p>Hello! Welcome to the once-yearly blog post! This year I'd like to examine the +most peculiar bug I encountered at work. To set the stage, let's start with a +little background.</p> + + + + Deprecating Layabout + Tue, 31 Dec 2019 17:29:00 -0800 + Unknown + https://AliAhmadi2004.ir/posts/deprecating-layabout/ + https://AliAhmadi2004.ir/posts/deprecating-layabout/ + <p>Since <a href="https://layabout.readthedocs.io/en/latest">Layabout</a> launched last year it has been downloaded 5,755 times, gotten +16 stars on GitHub, been used by a Portuguese startup to teach a +<a href="https://github.com/ricardojusto/haskell-workshop/blob/f96aa901700d0b10dad35f391a94017f502fb42a/s01/e10/bot/receive">Haskell workshop</a>, and received a <a href="https://twitter.com/roach/status/1019279698092744705">Twitter shout-out</a> from <a href="https://twitter.com/roach">@roach</a>, one of +the core contributors to the <a href="https://github.com/slackapi/python-slackclient">official Python Slack client</a>. During that time the +official client library also got a <strong>lot</strong> better! 
So much better, in fact, +that I've decided to deprecate Layabout.</p> + + + + Announcing Layabout + Sat, 30 Jun 2018 00:16:52 -0700 + Unknown + https://AliAhmadi2004.ir/posts/announcing-layabout/ + https://AliAhmadi2004.ir/posts/announcing-layabout/ + <p>Today I'm announcing <a href="https://layabout.readthedocs.io/en/latest/">Layabout</a>, my first official Python library. Layabout is +a small event handling library on top of the +<a href="https://api.slack.com/rtm">Slack Real Time Messaging (RTM) API</a>. You can get it right now +on <a href="https://pypi.org/project/layabout">PyPI</a>.</p> + + + + Resolving A DNS Issue + Sun, 24 Jun 2018 13:21:00 -0700 + Unknown + https://AliAhmadi2004.ir/posts/resolving-a-dns-issue/ + https://AliAhmadi2004.ir/posts/resolving-a-dns-issue/ + <p>Haha. Get it? <em>Resolving</em> a DNS issue. OK, that was bad. You don't have to read +anymore, but I'm <code>SOA</code> into this. You might even say I'm in the zone. I think +it's gonna be <code>A</code> great read, so consider sticking around, 'cuz there's no +TLD;R.</p> + + + + Stateful Callbacks in Python + Mon, 10 Jul 2017 07:54:00 -0800 + Unknown + https://AliAhmadi2004.ir/posts/stateful-callbacks-in-python/ + https://AliAhmadi2004.ir/posts/stateful-callbacks-in-python/ + <p>If you're unfamiliar with what a callback is, don't worry, we can sort that out +quickly. If callbacks are old hat for you you might want to skip to +<a href="https://AliAhmadi2004.ir/posts/stateful-callbacks-in-python/#stateful-callbacks">the interesting bit</a>.</p> +<p>Simply put, a callback is a function that is passed as an argument to +another function which may execute it.</p> + + + + gutenberg init blog + Sun, 18 Jun 2017 15:17:00 -0800 + Unknown + https://AliAhmadi2004.ir/posts/gutenberg-init-blog/ + https://AliAhmadi2004.ir/posts/gutenberg-init-blog/ + <p>When I first created this site I wanted to get it live as quickly as possible. 
+<a href="https://hexo.io/">Hexo</a>, a blogging framework written in Node.js, seemed like the perfect +tool. At the time I was rather interested in Node.js, so it seemed natural to +use a framework rooted in that community.</p> +<p>By the time of my last post I'd become increasingly disinterested in Node.js +and much more interested in Rust and its community. It was mostly +procrastination, but I convinced myself that using a tool written in a language +I didn't use often directly contributed to the paucity of posts here, so I +finally decided to ditch Hexo.</p> + + + + git filter-wat + Thu, 07 Apr 2016 16:06:33 -0800 + Unknown + https://AliAhmadi2004.ir/posts/git-filter-wat/ + https://AliAhmadi2004.ir/posts/git-filter-wat/ + <p>Welcome to this year's annual blog post!</p> +<p>I've been signing <code>git</code> commits for my <a href="https://github.com/reillysiemens/dotfiles">dotfiles</a> repository since its +inception in October of last year, so I was excited to see that GitHub recently +added <a href="https://github.com/blog/2144-gpg-signature-verification">GPG signature verification</a>. All you have to do is upload your +<a href="https://github.com/reillysiemens.gpg">public key</a> to GitHub and you'll be verifying commits like a champ. Or so I +thought…</p> + + + + hexo init blog + Fri, 24 Apr 2015 01:57:13 -0800 + Unknown + https://AliAhmadi2004.ir/posts/hexo-init-blog/ + https://AliAhmadi2004.ir/posts/hexo-init-blog/ + <p>I've been wanting to start a blog for a long time now. Today I'm pulling the +trigger on that with a simple <code>hexo init blog</code>. 
Well, it wasn't <em>that</em> simple, +so I feel like it's worth talking about a few of the complications I had.</p> + + + + diff --git a/self.jpg b/self.jpg new file mode 100644 index 0000000..c08f130 Binary files /dev/null and b/self.jpg differ diff --git a/site.css b/site.css new file mode 100644 index 0000000..a86f8cc --- /dev/null +++ b/site.css @@ -0,0 +1 @@ +@import url("https://fonts.googleapis.com/css?family=Noto+Sans|Noto+Serif|Noto+Sans+Mono");:root{--default-serif-font: "Noto Serif", serif;--default-sans-font: "Noto Sans", sans-serif;--default-mono-font: "Noto Sans Mono", monospace;--default-font-size: 16px;--header-font-size: 2.0em;--default-line-height: 1.5;--background-color: #1d2021;--foreground-color: #ebdbb2;--link-color: #458588;--muted-color: #928374;--primary-accent-color: #12cc44;--secondary-accent-color: #689d6a;--code-background-color: #282828;--inline-code-background-color: #3c3836}.accent{color:var(--primary-accent-color)}html,body{height:100%}body{display:flex;flex-direction:column;margin:0;line-height:var(--default-line-height);font-size:var(--default-font-size);font-family:var(--default-serif-font);background-color:var(--background-color);color:var(--foreground-color);border-top:.5em solid var(--primary-accent-color)}main{flex:1 0 auto;}#content{max-width:48em;margin-left:auto;margin-right:auto;padding-right:1.4em;padding-left:1.4em;padding-bottom:1.4em}header{overflow:hidden;margin-top:1.4em;margin-bottom:1.4em;margin-left:1em;margin-right:1em}header>h1{text-align:center;font-size:var(--header-font-size);font-family:var(--default-sans-font);padding-top:2.8rem;margin-top:0;margin-bottom:.7rem}header .post-meta{text-align:center;padding-top:.5em;padding-bottom:.9em;margin-bottom:0}.post-meta time{color:var(--muted-color)}h2,h3,h4,h5,h6{padding-top:1.2rem;margin-bottom:.2rem}p{padding-top:.4em;margin-bottom:1em}blockquote{font-family:var(--default-sans-font);font-style:italic;margin:0 
1.25rem;padding-left:1.25rem;padding-right:1.25rem;border-left:3px solid var(--secondary-accent-color)}a{color:var(--link-color);text-decoration:none}:where(h2,h3,h4,h5,h6)>a[href^="#"]::before{content:"#";display:inline-block;width:1.2rem;margin-left:-1.2rem;opacity:0}:where(h2,h3,h4,h5,h6):hover>a[href^="#"]::before{opacity:1}:where(h2,h3,h4,h5,h6):target>a[href^="#"]::before{opacity:1;color:var(--primary-accent-color)}img,video{width:100%;height:auto}figure{margin:0}figcaption{font-size:.85rem;color:var(--muted-color)}pre{display:block;padding:.25rem 1.25rem;font-size:85%;border-left:.25rem solid var(--primary-accent-color)}code{padding:0 .4rem;background-color:var(--inline-code-background-color);border-radius:2px}pre,code{overflow:auto;font-family:var(--default-mono-font)}pre>code{padding:0;background-color:var(--code-background-color);border-radius:unset}table{display:block;overflow-x:auto;margin-left:auto;margin-right:auto;text-align:left;border-collapse:collapse}th[scope=col]{padding-top:.4em;border-bottom:3px solid var(--secondary-accent-color)}th[scope=row]{border-right:3px solid var(--secondary-accent-color)}td{text-align:center;border:1px solid var(--muted-color)}th,td{vertical-align:bottom;padding-left:.4em;padding-right:.4em;border:1px solid var(--muted-color)}footer{border-top:.5em solid var(--primary-accent-color);text-align:center;background-color:var(--code-background-color);padding:1.4em;flex-shrink:0;}footer a{color:var(--secondary-accent-color)}footer>nav>a{margin:0 .4em}footer p{padding-top:0}footer p,footer a{font-family:var(--default-sans-font);font-variant-caps:all-small-caps}@media (min-width: 470px){:root{--default-font-size: 17px}header{margin-left:2.8em;margin-right:2.8em}}@media (min-width: 625px){:root{--default-font-size: 18px}}@media (min-width: 802px){:root{--default-font-size: 19px}}@media (min-width: 1003px){:root{--default-font-size: 20px}}@media (min-width: 1225px){:root{--default-font-size: 
21px}:where(h2,h3,h4,h5,h6)>a[href^="#"]::before{width:1.8rem;margin-left:-1.8rem}}@media (min-width: 1496px){:root{--default-font-size: 22px}}.center{display:flex;justify-content:center;align-items:center;padding-top:20px;height:100vh}.center img{width:50vh;border-radius:10%} \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml new file mode 100644 index 0000000..b760660 --- /dev/null +++ b/sitemap.xml @@ -0,0 +1,137 @@ + + + + https://AliAhmadi2004.ir/ + + + https://AliAhmadi2004.ir/contact/ + + + https://AliAhmadi2004.ir/posts/ + + + https://AliAhmadi2004.ir/posts/a-fresh-coat-of-paint/ + 2021-12-31T21:04:00-08:00 + + + https://AliAhmadi2004.ir/posts/announcing-layabout/ + 2018-06-30T00:16:52-07:00 + + + https://AliAhmadi2004.ir/posts/avatar-png/ + 2023-12-31T04:56:00-08:00 + + + https://AliAhmadi2004.ir/posts/deprecating-layabout/ + 2019-12-31T17:29:00-08:00 + + + https://AliAhmadi2004.ir/posts/git-filter-wat/ + 2016-04-07T16:06:33-08:00 + + + https://AliAhmadi2004.ir/posts/gutenberg-init-blog/ + 2017-06-18T15:17:00-08:00 + + + https://AliAhmadi2004.ir/posts/hexo-init-blog/ + 2015-04-24T01:57:13-08:00 + + + https://AliAhmadi2004.ir/posts/node-example-com-is-an-ip-address/ + 2020-12-28T06:13:05-08:00 + + + https://AliAhmadi2004.ir/posts/parsing-tftp-in-rust/ + 2022-12-31T16:45:00-08:00 + + + https://AliAhmadi2004.ir/posts/resolving-a-dns-issue/ + 2018-06-24T13:21:00-07:00 + + + https://AliAhmadi2004.ir/posts/stateful-callbacks-in-python/ + 2017-07-10T07:54:00-08:00 + + + https://AliAhmadi2004.ir/tags/ + + + https://AliAhmadi2004.ir/tags/blogging/ + + + https://AliAhmadi2004.ir/tags/callbacks/ + + + https://AliAhmadi2004.ir/tags/css/ + + + https://AliAhmadi2004.ir/tags/dns/ + + + https://AliAhmadi2004.ir/tags/git/ + + + https://AliAhmadi2004.ir/tags/github/ + + + https://AliAhmadi2004.ir/tags/gutenberg/ + + + https://AliAhmadi2004.ir/tags/hacks/ + + + https://AliAhmadi2004.ir/tags/hexo/ + + + https://AliAhmadi2004.ir/tags/html/ + + + 
https://AliAhmadi2004.ir/tags/http/ + + + https://AliAhmadi2004.ir/tags/internet/ + + + https://AliAhmadi2004.ir/tags/layabout/ + + + https://AliAhmadi2004.ir/tags/networking/ + + + https://AliAhmadi2004.ir/tags/nom/ + + + https://AliAhmadi2004.ir/tags/parsing/ + + + https://AliAhmadi2004.ir/tags/pgp/ + + + https://AliAhmadi2004.ir/tags/php/ + + + https://AliAhmadi2004.ir/tags/png/ + + + https://AliAhmadi2004.ir/tags/python/ + + + https://AliAhmadi2004.ir/tags/rust/ + + + https://AliAhmadi2004.ir/tags/slack/ + + + https://AliAhmadi2004.ir/tags/slow/ + + + https://AliAhmadi2004.ir/tags/testing/ + + + https://AliAhmadi2004.ir/tags/tftp/ + + + https://AliAhmadi2004.ir/tags/webdev/ + + diff --git a/tags/blogging/index.html b/tags/blogging/index.html new file mode 100644 index 0000000..085a076 --- /dev/null +++ b/tags/blogging/index.html @@ -0,0 +1 @@ +Tag - blogging
Me

Posts Tagged as "blogging"

gutenberg init blog

When I first created this site I wanted to get it live as quickly as possible. Hexo, a blogging framework written in Node.js, seemed like the perfect tool. At the time I was rather interested in Node.js, so it seemed natural to use a framework rooted in that community.

By the time of my last post I'd become increasingly disinterested in Node.js and much more interested in Rust and its community. It was mostly procrastination, but I convinced myself that using a tool written in a language I didn't use often directly contributed to the paucity of posts here, so I finally decided to ditch Hexo.

hexo init blog

I've been wanting to start a blog for a long time now. Today I'm pulling the trigger on that with a simple hexo init blog. Well, it wasn't that simple, so I feel like it's worth talking about a few of the complications I had.

\ No newline at end of file diff --git a/tags/callbacks/index.html b/tags/callbacks/index.html new file mode 100644 index 0000000..ae720e5 --- /dev/null +++ b/tags/callbacks/index.html @@ -0,0 +1 @@ +Tag - callbacks
Me

Posts Tagged as "callbacks"

Stateful Callbacks in Python

If you're unfamiliar with what a callback is, don't worry, we can sort that out quickly. If callbacks are old hat for you, you might want to skip to the interesting bit.

Simply put, a callback is a function that is passed as an argument to another function which may execute it.

\ No newline at end of file diff --git a/tags/css/index.html b/tags/css/index.html new file mode 100644 index 0000000..c7c4b20 --- /dev/null +++ b/tags/css/index.html @@ -0,0 +1 @@ +Tag - CSS
Me

Posts Tagged as "CSS"

A Fresh Coat of Paint

I'm starting the new year with a new job. To paraphrase a friend, "it's just moving from one $BIGCORP to another", but it's still exciting. I worked my last gig for 5 years, so I'm nervous, but also very ready to do something new. While I'm doing one new thing I might as well do another. Taking some time off between jobs has given me enough breathing room to redo my website.

\ No newline at end of file diff --git a/tags/dns/index.html b/tags/dns/index.html new file mode 100644 index 0000000..395afd8 --- /dev/null +++ b/tags/dns/index.html @@ -0,0 +1 @@ +Tag - DNS
Me

Posts Tagged as "DNS"

Resolving A DNS Issue

Haha. Get it? Resolving a DNS issue. OK, that was bad. You don't have to read anymore, but I'm SOA into this. You might even say I'm in the zone. I think it's gonna be A great read, so consider sticking around, 'cuz there's no TLD;R.

\ No newline at end of file diff --git a/tags/git/index.html b/tags/git/index.html new file mode 100644 index 0000000..68d0678 --- /dev/null +++ b/tags/git/index.html @@ -0,0 +1 @@ +Tag - Git
Me

Posts Tagged as "Git"

git filter-wat

Welcome to this year's annual blog post!

I've been signing git commits for my dotfiles repository since its inception in October of last year, so I was excited to see that GitHub recently added GPG signature verification. All you have to do is upload your public key to GitHub and you'll be verifying commits like a champ. Or so I thought…

\ No newline at end of file diff --git a/tags/github/index.html b/tags/github/index.html new file mode 100644 index 0000000..5d6c31d --- /dev/null +++ b/tags/github/index.html @@ -0,0 +1 @@ +Tag - GitHub
Me

Posts Tagged as "GitHub"

git filter-wat

Welcome to this year's annual blog post!

I've been signing git commits for my dotfiles repository since its inception in October of last year, so I was excited to see that GitHub recently added GPG signature verification. All you have to do is upload your public key to GitHub and you'll be verifying commits like a champ. Or so I thought…

\ No newline at end of file diff --git a/tags/gutenberg/index.html b/tags/gutenberg/index.html new file mode 100644 index 0000000..d2c02b4 --- /dev/null +++ b/tags/gutenberg/index.html @@ -0,0 +1 @@ +Tag - gutenberg
Me

Posts Tagged as "gutenberg"

gutenberg init blog

When I first created this site I wanted to get it live as quickly as possible. Hexo, a blogging framework written in Node.js, seemed like the perfect tool. At the time I was rather interested in Node.js, so it seemed natural to use a framework rooted in that community.

By the time of my last post I'd become increasingly disinterested in Node.js and much more interested in Rust and its community. It was mostly procrastination, but I convinced myself that using a tool written in a language I didn't use often directly contributed to the paucity of posts here, so I finally decided to ditch Hexo.

\ No newline at end of file diff --git a/tags/hacks/index.html b/tags/hacks/index.html new file mode 100644 index 0000000..f790774 --- /dev/null +++ b/tags/hacks/index.html @@ -0,0 +1 @@ +Tag - hacks
Me

Posts Tagged as "hacks"

hexo init blog

I've been wanting to start a blog for a long time now. Today I'm pulling the trigger on that with a simple hexo init blog. Well, it wasn't that simple, so I feel like it's worth talking about a few of the complications I had.

\ No newline at end of file diff --git a/tags/hexo/index.html b/tags/hexo/index.html new file mode 100644 index 0000000..1f46623 --- /dev/null +++ b/tags/hexo/index.html @@ -0,0 +1 @@ +Tag - Hexo
Me

Posts Tagged as "Hexo"

hexo init blog

I've been wanting to start a blog for a long time now. Today I'm pulling the trigger on that with a simple hexo init blog. Well, it wasn't that simple, so I feel like it's worth talking about a few of the complications I had.

\ No newline at end of file diff --git a/tags/html/index.html b/tags/html/index.html new file mode 100644 index 0000000..a9b425f --- /dev/null +++ b/tags/html/index.html @@ -0,0 +1 @@ +Tag - HTML
Me

Posts Tagged as "HTML"

A Fresh Coat of Paint

I'm starting the new year with a new job. To paraphrase a friend, "it's just moving from one $BIGCORP to another", but it's still exciting. I worked my last gig for 5 years, so I'm nervous, but also very ready to do something new. While I'm doing one new thing I might as well do another. Taking some time off between jobs has given me enough breathing room to redo my website.

\ No newline at end of file diff --git a/tags/http/index.html b/tags/http/index.html new file mode 100644 index 0000000..89c46e7 --- /dev/null +++ b/tags/http/index.html @@ -0,0 +1 @@ +Tag - HTTP
Me

Posts Tagged as "HTTP"

Resolving A DNS Issue

Haha. Get it? Resolving a DNS issue. OK, that was bad. You don't have to read anymore, but I'm SOA into this. You might even say I'm in the zone. I think it's gonna be A great read, so consider sticking around, 'cuz there's no TLD;R.

\ No newline at end of file diff --git a/tags/index.html b/tags/index.html new file mode 100644 index 0000000..b5feff2 --- /dev/null +++ b/tags/index.html @@ -0,0 +1 @@ +Tags
Me

All Tags

\ No newline at end of file diff --git a/tags/internet/index.html b/tags/internet/index.html new file mode 100644 index 0000000..c737cba --- /dev/null +++ b/tags/internet/index.html @@ -0,0 +1 @@ +Tag - Internet
Me

Posts Tagged as "Internet"

Resolving A DNS Issue

Haha. Get it? Resolving a DNS issue. OK, that was bad. You don't have to read anymore, but I'm SOA into this. You might even say I'm in the zone. I think it's gonna be A great read, so consider sticking around, 'cuz there's no TLD;R.

\ No newline at end of file diff --git a/tags/layabout/index.html b/tags/layabout/index.html new file mode 100644 index 0000000..35bf6ff --- /dev/null +++ b/tags/layabout/index.html @@ -0,0 +1 @@ +Tag - Layabout
Me

Posts Tagged as "Layabout"

Deprecating Layabout

Since Layabout launched last year it has been downloaded 5,755 times, gotten 16 stars on GitHub, been used by a Portuguese startup to teach a Haskell workshop, and received a Twitter shout-out from @roach, one of the core contributors to the official Python Slack client. During that time the official client library also got a lot better! So much better, in fact, that I've decided to deprecate Layabout.

Announcing Layabout

Today I'm announcing Layabout, my first official Python library. Layabout is a small event handling library on top of the Slack Real Time Messaging (RTM) API. You can get it right now on PyPI.

\ No newline at end of file diff --git a/tags/networking/index.html b/tags/networking/index.html new file mode 100644 index 0000000..cfead69 --- /dev/null +++ b/tags/networking/index.html @@ -0,0 +1 @@ +Tag - Networking
Me

Posts Tagged as "Networking"

avatar.png

No, not that Avatar. And not the other one either. This post is about avatar.png, a handful of lines of PHP that have inspired me for a long time.

Around 2011 or 2012 a friend of mine, Andrew Kvalheim, blew my mind when he made his Skype profile picture display the IP address of the computer I was using. It might have looked a bit like this.

Parsing TFTP in Rust

Several years ago I did a take-home interview which asked me to write a TFTP server in Go. The job wasn't the right fit for me, but I enjoyed the assignment. Lately, in my spare time, I've been tinkering with a Rust implementation. Here's what I've done to parse the protocol.

node.example.com Is An IP Address

Hello! Welcome to the once-yearly blog post! This year I'd like to examine the most peculiar bug I encountered at work. To set the stage, let's start with a little background.

\ No newline at end of file diff --git a/tags/nom/index.html b/tags/nom/index.html new file mode 100644 index 0000000..8df75b6 --- /dev/null +++ b/tags/nom/index.html @@ -0,0 +1 @@ +Tag - nom
Me

Posts Tagged as "nom"

Parsing TFTP in Rust

Several years ago I did a take-home interview which asked me to write a TFTP server in Go. The job wasn't the right fit for me, but I enjoyed the assignment. Lately, in my spare time, I've been tinkering with a Rust implementation. Here's what I've done to parse the protocol.

\ No newline at end of file diff --git a/tags/parsing/index.html b/tags/parsing/index.html new file mode 100644 index 0000000..7ca7a90 --- /dev/null +++ b/tags/parsing/index.html @@ -0,0 +1 @@ +Tag - Parsing
Me

Posts Tagged as "Parsing"

Parsing TFTP in Rust

Several years ago I did a take-home interview which asked me to write a TFTP server in Go. The job wasn't the right fit for me, but I enjoyed the assignment. Lately, in my spare time, I've been tinkering with a Rust implementation. Here's what I've done to parse the protocol.

\ No newline at end of file diff --git a/tags/pgp/index.html b/tags/pgp/index.html new file mode 100644 index 0000000..a260976 --- /dev/null +++ b/tags/pgp/index.html @@ -0,0 +1 @@ +Tag - PGP
Me

Posts Tagged as "PGP"

git filter-wat

Welcome to this year's annual blog post!

I've been signing git commits for my dotfiles repository since its inception in October of last year, so I was excited to see that GitHub recently added GPG signature verification. All you have to do is upload your public key to GitHub and you'll be verifying commits like a champ. Or so I thought…

\ No newline at end of file diff --git a/tags/php/index.html b/tags/php/index.html new file mode 100644 index 0000000..8052b90 --- /dev/null +++ b/tags/php/index.html @@ -0,0 +1 @@ +Tag - PHP
Me

Posts Tagged as "PHP"

avatar.png

No, not that Avatar. And not the other one either. This post is about avatar.png, a handful of lines of PHP that have inspired me for a long time.

Around 2011 or 2012 a friend of mine, Andrew Kvalheim, blew my mind when he made his Skype profile picture display the IP address of the computer I was using. It might have looked a bit like this.

\ No newline at end of file diff --git a/tags/png/index.html b/tags/png/index.html new file mode 100644 index 0000000..7fe4a6a --- /dev/null +++ b/tags/png/index.html @@ -0,0 +1 @@ +Tag - PNG
Me

Posts Tagged as "PNG"

avatar.png

No, not that Avatar. And not the other one either. This post is about avatar.png, a handful of lines of PHP that have inspired me for a long time.

Around 2011 or 2012 a friend of mine, Andrew Kvalheim, blew my mind when he made his Skype profile picture display the IP address of the computer I was using. It might have looked a bit like this.

\ No newline at end of file diff --git a/tags/python/index.html b/tags/python/index.html new file mode 100644 index 0000000..c9572c6 --- /dev/null +++ b/tags/python/index.html @@ -0,0 +1 @@ +Tag - Python
Me

Posts Tagged as "Python"

node.example.com Is An IP Address

Hello! Welcome to the once-yearly blog post! This year I'd like to examine the most peculiar bug I encountered at work. To set the stage, let's start with a little background.

Deprecating Layabout

Since Layabout launched last year it has been downloaded 5,755 times, gotten 16 stars on GitHub, been used by a Portuguese startup to teach a Haskell workshop, and received a Twitter shout-out from @roach, one of the core contributors to the official Python Slack client. During that time the official client library also got a lot better! So much better, in fact, that I've decided to deprecate Layabout.

Announcing Layabout

Today I'm announcing Layabout, my first official Python library. Layabout is a small event handling library on top of the Slack Real Time Messaging (RTM) API. You can get it right now on PyPI.

Stateful Callbacks in Python

If you're unfamiliar with what a callback is, don't worry, we can sort that out quickly. If callbacks are old hat for you, you might want to skip to the interesting bit.

Simply put, a callback is a function that is passed as an argument to another function which may execute it.
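
As a minimal sketch of that definition (the names `notify_each` and `seen` are made up for illustration and don't come from the post), here is a function that receives a callback and only calls it for some of its inputs:

```python
from typing import Callable, Iterable, List


def notify_each(items: Iterable[str], callback: Callable[[str], None]) -> None:
    """Call `callback` once for every non-empty item."""
    for item in items:
        # The receiving function decides whether the callback runs at all.
        if item:
            callback(item)


# Any callable works as a callback; here it's a bound list method.
seen: List[str] = []
notify_each(["a", "", "b"], seen.append)
```

After this runs, `seen` holds only the non-empty items, because `notify_each` skipped the callback for the empty string.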

\ No newline at end of file diff --git a/tags/rust/index.html b/tags/rust/index.html new file mode 100644 index 0000000..674172c --- /dev/null +++ b/tags/rust/index.html @@ -0,0 +1 @@ +Tag - Rust
Me

Posts Tagged as "Rust"

avatar.png

No, not that Avatar. And not the other one either. This post is about avatar.png, a handful of lines of PHP that have inspired me for a long time.

Around 2011 or 2012 a friend of mine, Andrew Kvalheim, blew my mind when he made his Skype profile picture display the IP address of the computer I was using. It might have looked a bit like this.

Parsing TFTP in Rust

Several years ago I did a take-home interview which asked me to write a TFTP server in Go. The job wasn't the right fit for me, but I enjoyed the assignment. Lately, in my spare time, I've been tinkering with a Rust implementation. Here's what I've done to parse the protocol.

gutenberg init blog

When I first created this site I wanted to get it live as quickly as possible. Hexo, a blogging framework written in Node.js, seemed like the perfect tool. At the time I was rather interested in Node.js, so it seemed natural to use a framework rooted in that community.

By the time of my last post I'd become increasingly uninterested in Node.js and much more interested in Rust and its community. It was mostly procrastination, but I convinced myself that using a tool written in a language I didn't use often directly contributed to the paucity of posts here, so I finally decided to ditch Hexo.

\ No newline at end of file diff --git a/tags/slack/index.html b/tags/slack/index.html new file mode 100644 index 0000000..22f4bdf --- /dev/null +++ b/tags/slack/index.html @@ -0,0 +1 @@ +Tag - Slack
Me

Posts Tagged as "Slack"

Deprecating Layabout

Since Layabout launched last year it has been downloaded 5,755 times, gotten 16 stars on GitHub, been used by a Portuguese startup to teach a Haskell workshop, and received a Twitter shout-out from @roach, one of the core contributors to the official Python Slack client. During that time the official client library also got a lot better! So much better, in fact, that I've decided to deprecate Layabout.

Announcing Layabout

Today I'm announcing Layabout, my first official Python library. Layabout is a small event handling library on top of the Slack Real Time Messaging (RTM) API. You can get it right now on PyPI.

\ No newline at end of file diff --git a/tags/slow/index.html b/tags/slow/index.html new file mode 100644 index 0000000..4f57b48 --- /dev/null +++ b/tags/slow/index.html @@ -0,0 +1 @@ +Tag - Slow
Me

Posts Tagged as "Slow"

Resolving A DNS Issue

Haha. Get it? Resolving a DNS issue. OK, that was bad. You don't have to read anymore, but I'm SOA into this. You might even say I'm in the zone. I think it's gonna be A great read, so consider sticking around, 'cuz there's no TLD;R.

\ No newline at end of file diff --git a/tags/testing/index.html b/tags/testing/index.html new file mode 100644 index 0000000..7506580 --- /dev/null +++ b/tags/testing/index.html @@ -0,0 +1 @@ +Tag - Testing
Me

Posts Tagged as "Testing"

node.example.com Is An IP Address

Hello! Welcome to the once-yearly blog post! This year I'd like to examine the most peculiar bug I encountered at work. To set the stage, let's start with a little background.

\ No newline at end of file diff --git a/tags/tftp/index.html b/tags/tftp/index.html new file mode 100644 index 0000000..934a178 --- /dev/null +++ b/tags/tftp/index.html @@ -0,0 +1 @@ +Tag - TFTP
Me

Posts Tagged as "TFTP"

Parsing TFTP in Rust

Several years ago I did a take-home interview which asked me to write a TFTP server in Go. The job wasn't the right fit for me, but I enjoyed the assignment. Lately, in my spare time, I've been tinkering with a Rust implementation. Here's what I've done to parse the protocol.

\ No newline at end of file diff --git a/tags/webdev/index.html b/tags/webdev/index.html new file mode 100644 index 0000000..1bdf2e9 --- /dev/null +++ b/tags/webdev/index.html @@ -0,0 +1 @@ +Tag - webdev
Me

Posts Tagged as "webdev"

avatar.png

No, not that Avatar. And not the other one either. This post is about avatar.png, a handful of lines of PHP that have inspired me for a long time.

Around 2011 or 2012 a friend of mine, Andrew Kvalheim, blew my mind when he made his Skype profile picture display the IP address of the computer I was using. It might have looked a bit like this.

A Fresh Coat of Paint

I'm starting the new year with a new job. To paraphrase a friend, "it's just moving from one $BIGCORP to another", but it's still exciting. I worked my last gig for 5 years, so I'm nervous, but also very ready to do something new. While I'm doing one new thing I might as well do another. Taking some time off between jobs has given me enough breathing room to redo my website.

\ No newline at end of file