Peter Hajas

Peter smiling against a blue background

Hi there! I'm Peter, and this is my website. I like to write about things I care about: my dog, statistics about my life, and fun problems I run into. For more, check out the about page, or read some of the posts below.

Visualizing Pokémon Red and Blue Connections
April 14, 2020 •  😺🗺

I love Pokémon. My fascination with the series began with the original games released in the US, Pokémon Red and Blue (I had the Blue version).

In the past few years, people have been disassembling Pokémon games. You can check these out for Pokémon Red and Blue, Crystal, Emerald, and others. It’s really cool to be able to compile and build a game that was such a huge part of my youth.

I thought it would be fun to play with this source code, viewing these games through a new lens. A few months ago, I discovered Graphviz, a software package for rendering graphs written in the Dot language. Dot is a very simple language, and it’s easy to filter data into its format. Graphviz includes some command line tools that can render dot files to nice human-readable output. Let’s see how we can use Graphviz to visualize Pokémon Red and Blue.

Inside pokered, there’s a data directory with a mapHeaders subdirectory. mapHeaders includes metadata about every overworld map in the game, including the connections between maps. For example, here is the metadata for Route 10:

$ cat Route10.asm 
    db OVERWORLD ; tileset
    db ROUTE_10_HEIGHT, ROUTE_10_WIDTH ; dimensions (y, x)
    dw Route10_Blocks ; blocks
    dw Route10_TextPointers ; texts
    dw Route10_Script ; scripts
    db SOUTH | WEST ; connections
    WEST_MAP_CONNECTION ROUTE_10, ROUTE_9, 0, 0, Route9_Blocks
    dw Route10_Object ; objects

So south of Route 10 is Lavender Town, and west is Route 9. We can use this connection data and some simple uses of grep and awk to generate Dot code representing these connections. The following commands are all run from /data/mapHeaders in the pokered repository. First, we use grep to see the connections:

$ grep -R "MAP_CONNECTION" ./
.//PewterCity.asm:  SOUTH_MAP_CONNECTION PEWTER_CITY, ROUTE_2, 5, 0, Route2_Blocks
.//PewterCity.asm:  EAST_MAP_CONNECTION PEWTER_CITY, ROUTE_3, 4, 0, Route3_Blocks

Next, let’s pipe that to awk to print the endpoints of that connection:

$ grep -R "MAP_CONNECTION" ./ | awk -F" " '{ print $3 $4; }'

We can use a second awk invocation to print these as Dot edges:

$ grep -R "MAP_CONNECTION" ./ | awk -F" " '{ print $3 $4; }' | awk -F"," '{ print $1" -- "$2 }'

Undirected Dot edges are written as the two nodes with a -- between them.

Because these map connections are bidirectional, we want an undirected graph, and making it strict (at most one edge between any pair of nodes) collapses the duplicate connections. We get both by wrapping the edges in a strict graph {} block:

strict graph {
    ...
}

We can add two other options (overlap=false to avoid node overlap and splines=true to use splines for edges) to get a better looking graph. Here’s my generated Dot file from the above steps:

strict graph {
    overlap=false
    splines=true
    ROUTE_9 -- ROUTE_10
    ROUTE_20 -- ROUTE_19
    ROUTE_22 -- ROUTE_23
    ROUTE_23 -- ROUTE_22
    ROUTE_24 -- ROUTE_25
    ROUTE_18 -- ROUTE_17
    ROUTE_19 -- ROUTE_20
    ROUTE_25 -- ROUTE_24
    ROUTE_14 -- ROUTE_15
    ROUTE_14 -- ROUTE_13
    ROUTE_15 -- ROUTE_14
    ROUTE_17 -- ROUTE_16
    ROUTE_17 -- ROUTE_18
    ROUTE_16 -- ROUTE_17
    ROUTE_12 -- ROUTE_13
    ROUTE_12 -- ROUTE_11
    ROUTE_13 -- ROUTE_12
    ROUTE_13 -- ROUTE_14
    ROUTE_11 -- ROUTE_12
    ROUTE_10 -- ROUTE_9
    ROUTE_4 -- ROUTE_3
    ROUTE_3 -- ROUTE_4
}

We can use a simple invocation of neato on this file to produce a PDF:

neato -Tpdf > pokemon_rb_towns_and_routes.pdf
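
The steps above can be chained into a single script. Here’s a self-contained sketch - a fabricated sample line stands in for a real pokered checkout, so the paths and file contents below are illustrative only:

```shell
# Build the towns-and-routes Dot file end-to-end. The sample mapHeaders
# file is a stand-in for a real pokered checkout.
tmp=$(mktemp -d)
mkdir -p "$tmp/mapHeaders"
printf '\tWEST_MAP_CONNECTION ROUTE_10, ROUTE_9, 0, 0, Route9_Blocks\n' \
    > "$tmp/mapHeaders/Route10.asm"

{
    echo "strict graph {"
    echo "    overlap=false"
    echo "    splines=true"
    # same grep/awk pipeline as above, emitting indented -- edges
    grep -R "MAP_CONNECTION" "$tmp/mapHeaders" \
        | awk '{ print $3 $4 }' \
        | awk -F, '{ print "    " $1 " -- " $2 }'
    echo "}"
} > "$tmp/towns_and_routes.dot"

cat "$tmp/towns_and_routes.dot"
```

From there, neato -Tpdf towns_and_routes.dot > out.pdf renders the PDF (assuming Graphviz is installed).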

Check it out:

A graph visualizing all towns and routes in Pokémon Red and Blue (PDF file here)

OK, so towns and routes are cool. Can we augment this file to include buildings, tunnels, and rooms? There are warp and warp_to markers in the files in /data/mapObjects. For example, let’s look at SaffronCity.asm:

$ cat data/mapObjects/SaffronCity.asm
    db $f ; border block

    db 8 ; warps
    warp 7, 5, 0, COPYCATS_HOUSE_1F
    warp 26, 3, 0, FIGHTING_DOJO
    warp 34, 3, 0, SAFFRON_GYM
    warp 13, 11, 0, SAFFRON_PIDGEY_HOUSE
    warp 25, 11, 0, SAFFRON_MART
    warp 18, 21, 0, SILPH_CO_1F
    warp 29, 29, 0, MR_PSYCHICS_HOUSE

    ; warp-to
    warp_to 18, 21, SAFFRON_CITY_WIDTH ; SILPH_CO_1F


(These _WIDTH suffixes seem to indicate the coordinates are computed using the map’s width. We’ll strip them out later.)

So, if we parse out the warp_to statements, we should be able to get a more complete view of the game’s locations and how they connect. Let’s start with a simple grep to find all the warp_to statements (run from /data/mapObjects):

$ grep -R "warp_to " ./
.//RocketHideoutB4F.asm:    warp_to 19, 10, ROCKET_HIDEOUT_B4F_WIDTH ; ROCKET_HIDEOUT_B3F
.//RocketHideoutB4F.asm:    warp_to 24, 15, ROCKET_HIDEOUT_B4F_WIDTH ; ROCKET_HIDEOUT_ELEVATOR
.//RocketHideoutB4F.asm:    warp_to 25, 15, ROCKET_HIDEOUT_B4F_WIDTH ; ROCKET_HIDEOUT_ELEVATOR

Next, pipe to awk to find the endpoints of the warp:

$ grep -R "warp_to " ./ | awk -F"," '{ print $3; }'

This is close, but includes some warps that appear to point to themselves - lines whose third field is just the map’s own _WIDTH constant, with no destination comment after a ;. No problem - we can print only the lines with ; in them using grep:

$ grep -R "warp_to " ./ | awk -F"," '{ print $3; }' | grep ";"

OK, almost done. Next, let’s strip out the _WIDTH text and put in edge connections:

$ grep -R "warp_to " ./ | awk -F"," '{ print $3; }' | grep ";" | sed -e "s/_WIDTH//" | sed -e "s/;/--/"

(note the leading space here - not a big deal for the Graphviz tools)
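
As with the route connections, the warp pipeline can be tried end-to-end against a fabricated sample line (a stand-in for a real file in data/mapObjects):

```shell
# Run the warp_to pipeline against a sample mapObjects line (a stand-in
# for a real pokered checkout).
tmp=$(mktemp -d)
printf '\twarp_to 19, 10, ROCKET_HIDEOUT_B4F_WIDTH ; ROCKET_HIDEOUT_B3F\n' \
    > "$tmp/RocketHideoutB4F.asm"

# extract the third comma field, keep destination lines, strip _WIDTH,
# and turn the ; into a Dot edge
grep -R "warp_to " "$tmp" \
    | awk -F"," '{ print $3; }' \
    | grep ";" \
    | sed -e "s/_WIDTH//" -e "s/;/--/"
```

The output keeps the leading space mentioned above, and reads ROCKET_HIDEOUT_B4F -- ROCKET_HIDEOUT_B3F.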

Now, we’ll put this all into a file (along with the route connections from the map headers) to make a graph of all of the locations in Pokémon Red and Blue. For this invocation, I again used neato:

neato -Tpdf > pokemon_rb_all.pdf

This graph is so cool! Check it out:

A graph visualizing all locations in Pokémon Red and Blue (PDF file here, dot file here)

There are so many sections of this graph with interesting details, like Victory Road leading into the Indigo Plateau and Elite Four:

A cropped version of the "all locations" graph showing just Victory Road, the Indigo Plateau, and the Elite Four sections of the graph

Or the maze-like Silph Company building:

A cropped version of the "all locations" graph showing just the Silph Company building floors

I think it’s really cool how easy it is to use simple tools to see these games from a new angle. I hope to look at other aspects of these games sometime in the future.

February Trip to Death Valley
March 30, 2020 •  🏜️⛺

In February, I had the opportunity to go on a trip to Death Valley National Park with some friends. It was an awesome time! Here’s what happened:


I was lucky enough to go to REI with one of my buddies, a camping veteran, before the trip. He was also the organizer of our trip to Death Valley. At REI, I secured some supplies, including essentials (sleeping bag, freeze-dried food), things I thought were unnecessary but ended up being very useful (utensils, sleeping pad), and things I had no idea you needed or wanted on a camping trip (pillow, chair). It helps to go with people who are experienced!

The Trip Out

We left for our destination at an early 5AM. The Death Valley area is a good 6-7 hours from the Bay Area. The drive was pretty, complete with an early morning coffee pit stop in Gilroy. Onwards…

Campsite One: The Trona Pinnacles

We arrived at our first campsite: the Trona Pinnacles. This area looks like another planet:

photo I took of Trona Pinnacles

In fact, it’s been used for just that in movie and TV sets (Star Trek V was filmed here). It was also the setting for Lady Gaga’s latest video.

This is also where fighter jets do test flights, which is really cool to see up close:

photo I took of fighter jet

The area is part of California’s BLM lands. This made it a great spot for camping and for a fire.

Along with some Ardbeg, I enjoyed a delicious freeze-dried meal: chili macaroni and cheese with beef.

Campsite Two: In The Park

After a nice oatmeal and some instant coffee, we pushed off to Trona, CA to stock up on some snacks and get gas. Trona is an industrial town in the middle of nowhere. My fiancé and I fell in love with it on our first visit. They mine borax and salt. I love this note from the Trona High School Wikipedia page:

Until several years ago an annual game was played against Boron High School. Referred to as the Borax Bowl by some, the game was a matchup of two mining towns that are world leaders in producing potash and borax, minerals used in a number of products.

While in Trona, we saw a motorcycle group with a dog that rode along. Cool!

photo of dog on motorcycle

We took off to enter Death Valley. Rather than entering through the visitor center from California, we took a side route. We saw some amazing sights on the way into the park:

photo from Inyokern entering the park

I even saw an abandoned mineshaft - cool!

photo of abandoned mineshaft

We then arrived at our next camp site. We set up our tents, and imbibed and ate.

The night was a bit windy, but the tent I was loaned stood up just fine after some rocks were placed inside.

Campsite Three: Echo Canyon

We awoke, had some breakfast, and pushed off. We stopped off at some abandoned campgrounds and checked out a cool natural spring:

photo of natural spring

We headed off to Beatty, NV for a stop at a diner:

photo of food at diner

We had to make a game-time decision about where to stay on this third night. We had thought of staying that night in a canyon, but the park was experiencing snow (in a desert no less):

photo of snow

I was really hoping we would not have to call off our trip a day early. We decided to stay the night - I was happy we did - and headed off to Echo Canyon.

Inside the canyon, we found a nice campsite against some rocks:

photo of campsite in echo canyon

We pitched our tents, and had dinner for the evening.

As the night wore on, the wind picked up more and more. I awoke at around 4 in the morning to the wall of the tent covering my body, pushed by a strong wind. I figured I wouldn’t suffocate (I was getting barraged by oxygen), so I went back to bed.

The Trip Back

The next morning, after our customary breakfast, we packed up and headed back to civilization. We saw some beautiful views on our way out:

photo of mountains on last day photo of snow on last day

We stopped at a truck stop for some food. I had a Double Quarter Pounder with Cheese - it was delicious.

All in all, a great trip! I hope to get back soon.

Peterometer Chapter 2: Counting Raw Eggs
January 29, 2020 •  📊🥚

This is the second in the Peterometer series. The first can be found here.

After I felt comfortable tracking hydration, I wanted to introduce other metrics into my life. I’ve been tracking the food I’ve eaten since March 2019, with less than a week of total downtime since then.

I track my food in the LoseIt app. I usually log the food right after eating it, and I really enjoy seeing how many calories I’ve eaten that day.

In the past year, I’ve also become really into eating raw eggs. I enjoy them on starchy foods, like ramen or rice. Kind of like my own version of tamago kake gohan. In a given week, I’ll end up eating a few meals with raw eggs.

But how many raw eggs? Let’s find out!

Exporting LoseIt Data

LoseIt lets you export the food you’ve eaten on a weekly basis from their website. These exports are CSV files, which is perfect for processing and backup. If you’re a Premium member, go to your “Insights” tab on the website and then look for “Weekly Summary” and “Export to spreadsheet”. I export my data every 2 or 3 weeks, and collect it all in a folder on my machine. These files are named for their starting day, counted in days since January 1, 2000:

$ ls

Generating a master food CSV

I wrote a simple shell script to combine all these summaries, remove workouts (I track these separately), and then generate a master combine.csv file with all my food in one place:

# delete the combine file
rm combine.csv
# concatenate all the csvs, sorting
echo "Date,Name,Type,Quantity,Units,Calories,Fat (g),Protein (g),Carbohydrates (g),Saturated Fat (g),Sugars (g),Fiber (g),Cholesterol (mg),Sodium (mg)" > combine.csv
cat *.csv | grep -v "Calorie Burn" | grep -v "Date,Name" | grep -v "HealthKit" | sort >> combine.csv

With this, we can cat out the combined CSV file to see the food I’ve eaten:

$ cat combine.csv
Date,Name,Type,Quantity,Units,Calories,Fat (g),Protein (g),Carbohydrates (g),Saturated Fat (g),Sugars (g),Fiber (g),Cholesterol (mg),Sodium (mg)
01/01/2020,"Sandwich, Sub, JJ Gargantuan",Lunch,1.5,Each,"1,656",82.50,108,124.50,22.50,n/a,7.50,268.50,5121

(this example highlights a problem with my use of sort in the script: 01/01/2020 sorts alphabetically before 12/01/2019)
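
One way to fix this, as a sketch: since the dates are fixed-width MM/DD/YYYY, sort can key on character positions inside the first comma-separated field.

```shell
# Chronological sort for MM/DD/YYYY dates in column one: key on year
# (chars 7-10), then month (chars 1-2), then day (chars 4-5).
printf '01/01/2020,Sandwich\n12/01/2019,Eggs\n' \
    | sort -t, -k1.7,1.10 -k1.1,1.2 -k1.4,1.5
```

This puts the 12/01/2019 line before the 01/01/2020 one, where a plain sort does the opposite.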

Analyzing the food I’ve eaten

We can use some simple utilities to query the file. For example, how many food entries have I logged?

$ cat combine.csv | wc -l

Wow! That’s a lot of food. This averages to around 12 entries per day.

We can grep for other data, like how many burritos I’ve had.

$ cat combine.csv | grep -i burrito | wc -l

I’ve eaten a burrito every 9 days or so.

Counting raw eggs

But how many raw eggs have I eaten? I usually log these when I eat actual raw eggs (when I cook an egg dish on the stove without oil I will also count it as raw eggs). Let’s grep the CSV file:

$ cat combine.csv |grep -i egg
01/02/2020,Egg raw,Dinner,2.0,Servings,140,9.60,12.60,0.80,3.20,n/a,n/a,372,142
01/02/2020,Scrambled Eggs,Breakfast,1.0,Each,91,7,6,1,2,0.80,0,169,88

This grep picked up anything with “egg” in the title, like scrambled eggs. We only want to look for raw eggs - it’s simple enough to do that by modifying our grep invocation. But keep in mind the header line of combine.csv:

Date,Name,Type,Quantity,Units,Calories,Fat (g),Protein (g),Carbohydrates (g),Saturated Fat (g),Sugars (g),Fiber (g),Cholesterol (mg),Sodium (mg)

The fourth column is the quantity - how many raw eggs I actually ate. So we really want to sum that column.

We can print the fourth column with a bit of awk:

$ cat combine.csv |grep -i egg |grep -i raw | awk -F, '{ print $4; }'

and sum it with a bit more:

$ cat combine.csv |grep -i egg |grep -i raw | awk -F, '{ eggs+=$4; } END { print eggs; } '

(yes, I did eat non-whole portions of eggs some days)

276.75 raw eggs. That’s nearly two dozen dozen raw eggs!

Raw egg consumption based on day of the week

As a fun visualization, I wanted to see if I ate raw eggs more on a particular day of the week. I did this using some simple categorization in Numbers:

Raw eggs visualized by day

Wow! Based on this, I had way fewer raw eggs on Fridays. Tuesdays and Thursdays are lighter weekdays for raw egg consumption.

Looking forward to more food analysis

Counting raw eggs was a fun test of the data I’ve been gathering for the past 10 months. I’m really excited about the other visualizations and analysis I can do on the food that I’ve eaten.

The Platform Master
January 4, 2020 •  🕹📺

I watched this documentary about Ulillillia, the internet figure. It follows him and the small town of Minot, ND in the aftermath of the 2011 Souris River Flood. Though it was made in 2012, it wasn’t released until the end of 2019.

While the film moved at a slower pace, I found some of Ulillillia’s quotes from it interesting. In the documentary, Ulillillia lives with his parents:

There’s very little I can do, so I don’t get much enjoyment … I definitely wish I could travel more, or see the world … it’s like I can’t get out of the cage

According to this wiki page, he has since gotten a driver’s license and moved out of his parents’ home to Florida.

Upgrading a Nintendo Switch SD card
December 21, 2019 •  🎮💾

I own a Nintendo Switch console. Because I don’t want physical games cluttering my home, I get all my games digitally. I recently started running out of room and needed a new SD card to hold my games. This path ended up being circuitous, taking me through fake flash storage, the Switch’s file structure, and its operating system’s treatment of Unix file attributes.

A new card

I purchased a Samsung 512GB SD card from Amazon for $65. I’m shocked at how cheap SD cards are now. This replaced my old 128GB card.

Here’s the old card:

The old SD card

And here’s the new card:

The new SD card

Flash memory has never been more popular. Unfortunately, there has been a recent rise of fake flash memory. These are USB keys, SD cards, and solid state disks that masquerade as a larger capacity, but will lose your data or fail to read / write properly.

After my new card arrived, I wanted to check its authenticity. Is this a real 512GB SD card? Or a fake?

The f3 tool

f3 is a tool to verify the authenticity of flash memory. It stands for “Fight Flash Fraud” or “Fight Fake Flash”. From the site:

I started this project when I bought a 32GB microSDHC card for my Android phone back in 2010, and found out that this card always fails when one fills it up.

I installed it on my Mac with brew install f3. This installed two binaries: f3write and f3read.

f3write writes contiguous 1GB files to your card until it has exhausted its space. It will also report on the write speed for this operation. You use it by running it on the path you want to write to. It will report progress when running:

$ f3write /Volumes/test
F3 write 7.2
Copyright (C) 2010 Digirati Internet LTDA.
This is free software; see the source for copying conditions.

Free space: 476.69 GB
Creating file 1.h2w ... OK!                           
Creating file 2.h2w ... OK!                           
Creating file 476.h2w ... OK!                        
Creating file 477.h2w ... OK!                       
Free space: 1.38 MB
Average writing speed: 77.17 MB/s

My card got 77MB/s, which comes out to about 2 hours of write time (this is about how long it felt). After doing the write step, you can run f3read to verify that the files made it there successfully:
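
That estimate checks out roughly; converting the reported free space and write speed (treating a GB as 1024 MB here is an assumption about f3’s units):

```shell
# rough write-time estimate: capacity divided by measured write speed
awk 'BEGIN { printf "%.0f minutes\n", 476.69 * 1024 / 77.17 / 60 }'
```

That works out to about 105 minutes - just under two hours, matching how long the write felt.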

$ f3read /Volumes/test
F3 read 7.2
Copyright (C) 2010 Digirati Internet LTDA.
This is free software; see the source for copying conditions.

                  SECTORS      ok/corrupted/changed/overwritten
Validating file 1.h2w ... 2097152/        0/      0/      0
Validating file 2.h2w ... 2097152/        0/      0/      0
Validating file 476.h2w ... 2097152/        0/      0/      0
Validating file 477.h2w ... 1432983/        0/      0/      0

  Data OK: 476.68 GB (999677335 sectors)
Data LOST: 0.00 Byte (0 sectors)
           Corrupted: 0.00 Byte (0 sectors)
    Slightly changed: 0.00 Byte (0 sectors)
         Overwritten: 0.00 Byte (0 sectors)
Average reading speed: 71.12 MB/s

This will read each of the 1GB files created by f3write, verifying they did not experience any data loss. This will also report on your card’s read speed. Mine was 71MB/s, which also felt like around 2 hours.

Cool, a real SD card that works! Next step: move my games over.

Switch SD file structure

Plugging my old SD card into my machine, I see that diskutil reports an NTFS filesystem:

$ diskutil list
/dev/disk3 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *127.9 GB   disk3
   1:               Windows_NTFS                         127.8 GB   disk3s1

The hierarchy on the card (named “Untitled”) has a single “Nintendo” folder on it. What’s in there?

$ tree Nintendo
├── Album
│   ├── 2017
│   ├── 2018
│   └── 2019
├── Contents
│   ├── placehld
│   ├── private
│   ├── private1
│   └── registered
├── save
└── saveMeta

829 directories, 688 files

(heavily truncated)

So, it looks like:

- Album, which includes all my screenshots and video recordings arranged by year
- Contents, which has some subdirectories:
    - placehld, which has some empty directories - likely placeholders given the filename
    - private, a 16B file which looks like it contains a key
    - private1, a 32B file which looks like it contains a different key
    - registered, which appears to contain all the games

I’ve used the majority of the card’s 128GB space:

$ du -skh /Volumes/Untitled
111G    /Volumes/Untitled

Formatting the card

Next step was to insert the 512GB card into the Switch and ask it to format. This process took about 3 seconds, then the console rebooted itself. I verified that the card was now NTFS:

$ diskutil list
/dev/disk3 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *512.1 GB   disk3
   1:               Windows_NTFS                         512.0 GB   disk3s1

Backing up the card

I don’t have two SD card slots on my machine, so I’ll need to copy the data first to my computer, then onto the card. We can use rsync to do this:

$ mkdir nintendo_backup
$ rsync -avz /Volumes/Untitled nintendo_backup/

I used rsync in archive mode by passing the -a flag, which according to man rsync will:

…ensure that symbolic links, devices, attributes, permissions, ownerships, etc. are preserved…

Perfect! I also passed -v for verbose, and -z to use compression as the files are moved.

Restoring to the new card

Now it’s time to restore to the new card. After unmounting the smaller card and inserting the new card, it’s a simple matter of rsyncing the files back:

$ rsync -avz nintendo_backup/Untitled/ /Volumes/Untitled
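
Before trusting the restore, a recursive comparison can catch silent copy failures. Here’s a self-contained sketch, with temp directories standing in for the real backup and card paths:

```shell
# Verify a copy with a recursive diff. In practice, swap in the real
# paths (nintendo_backup/Untitled and /Volumes/Untitled).
src=$(mktemp -d); dst=$(mktemp -d)
echo "save data" > "$src/save.bin"
cp -R "$src/." "$dst/"
# diff -r is silent and exits 0 when the trees match
diff -r "$src" "$dst" && echo "contents match"
```

Note this compares file contents only, not file attributes.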

Trying it out

With the rsync step done, we can test out the card:

The card's not working

It doesn’t work! The Switch responds with error 2016-0247 and is unable to access the card.

I then tried copying the contents of nintendo_backup over the rsync‘d backup files with Finder. Perhaps the rsync step missed something? Or my machine fell asleep?

Nope, same error. I tried the original card to verify my Switch was still functional. The original card works fine.

Debugging the issue

So, we’ve got a new SD card that the Switch won’t see.

The Nintendo support site seems to suggest similar steps to those I followed.

Although it does contain this strange warning:

Important: This process may not be able to copy the microSD card contents correctly in environments other than Windows (such as Mac).

A fix

It turns out the files on the card were missing the archive attribute. Running sudo chflags -R arch /Volumes/Untitled to set the Archive attribute fixed the issue (I also ran a sudo dot_clean -mn on the directory). Phew! Look at all that space:

A screenshot showing the free space on the new larger card

Why did this break?

I’m not totally sure. My hunch is that an archive attribute on the files was confusing the Switch’s SD card parsing. It looks like the Nintendo Switch’s OS is not Unix-derived, but rather an evolution of the 3DS OS. It’s possible that some file attributes from Unix may have confused the Switch.

Peterometer Chapter 1: Tracking Hydration
March 8, 2019 •  📊🚰

This is the first in a series of posts I hope to write about building tools for “Peterometer”, a way to visualize stats I’ve collected about myself.

Inspiration for Peterometer

I’ve long been inspired by people who build beautiful visualizations of their gathered metrics. For example, Nicholas Felton’s annual reports. Here is an example from his 2007 report:

The first page of Nicholas Felton’s 2007 annual report

I think this format of visualization is really cool. It’s easy to glance at while being information-dense.

Another cool example is Anand Sharma’s April Zero, which is now an app called Gyroscope:

An image of Gyroscope

He has some great posts about his creative process here and here. The Iron Man-style HUD influence shines through in the finished product.

These visualizations are really cool ways to show stats gathered about your life. I’ve recently been getting more into tracking my day-to-day life, and experimenting with visualization techniques to showcase this data.

Hydration Tracking

Since December 2018, I’ve been tracking my hydration every day. Every time I finish drinking something, I log the type of drink it was, and how much of it I drank. I log the data using the WaterMinder app on my watch and phone. WaterMinder lets you have saved drinks, which is really helpful if you drink the same thing often (my Nalgene, a cup of coffee from the machine at work, etc.).

Some Visualizations

WaterMinder lets you export your data in CSV format. With a little bit of Python, we can parse this into a “stream” stacked area chart to show how I hydrate myself:

A stacked area plot of my hydration

This chart is a bit tough to read due to data density. The legend is sorted in descending order of consumption. By the numbers, this is:

Drink fluid oz.
Water 4159.7
Coffee 1482.7
Soda 826.9
Sports Drink 761.9
Tea 751.4
Smoothie 398.0
Carbonated Water 381.4
Energy Drink 273.6
Beer 225.0
Protein Shake 162.0
Liquor 130.0
Juice 94.0
Coconut Water 81.6
Milk 55.1
Wine 17.6
Hot Chocolate 15.0
Total 9815.9
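
Totals like these can be computed straight from the exported CSV with awk. A sketch - the three-column layout here (date, drink type, ounces) is a hypothetical stand-in, not WaterMinder’s documented export schema:

```shell
# Sum ounces per drink type from a (hypothetical) date,drink,ounces CSV,
# then sort descending by total.
printf '12/01/2018,Water,32\n12/01/2018,Coffee,8\n12/02/2018,Water,16\n' \
    | awk -F, '{ total[$2] += $3 } END { for (d in total) print d, total[d] }' \
    | sort -k2 -rn
```

With the sample lines above, Water totals 48 ounces and sorts first.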

Using the chart and table there are some interesting takeaways:

- I drink a lot of coffee. Almost 44 liters during this period!
- There’s coconut water consumption in late December - I had this while skiing
- In early February, there are a few days with only water, smoothies, and some soda. This corresponds to when I had my wisdom teeth removed
- I have energy drinks on the same day as protein shakes. This matches reality - I have a pre-workout energy drink and protein shake on days that I lift
- I seldom have beer, but I have more beer than I have protein shakes
- I infrequently have wine and liquor. This is good!
- You can see wine consumption during the New Years period
- Most liquor consumption was during vacations

Analyzing My Dog's DNA
January 21, 2019 •  🐕🧬

Dogs are awesome. Since March of 2014, I’ve owned a little dog named Riker:

Riker the small dog outside

I adopted her from the Humane Society in Milpitas. She was 8 weeks old. She recently had her fifth birthday. Happy birthday Riker! I love her very much.

Anyways, when I got her, I was curious about what type of dog she was. She looks like a chihuahua, but has some features that don’t match the traditional chihuahua, like her curled tail and banded coat.

I thought she might have been part Shiba Inu. These dogs are adorable and have the curled tail and gradient-banded coat. Here’s one from Wikipedia:

A shiba inu dog outside on grass

Doesn’t this dog look like Riker? With the tail and banded coat? I thought so too.

Whenever other people asked me about Riker, I would tell them she’s “part chihuahua, part shiba inu”. They believed me, but I didn’t have any proof.

Enter Embark

A good friend of mine told me about a service where you can get your dog’s DNA analyzed. It’s called EmbarkVet (disclaimer: this is a referral link). Embark gives your dog a breed report, health analysis, and some cool genetic information. It’s like 23andMe for your dog.

I signed up for the service. A few days later they sent me a DNA swab kit. It’s like an oversized q-tip. After sticking this in Riker’s cheeks for 30 seconds or so (she wasn’t happy about this, but I told her it was for science), I packed the kit in the return shipping and sent it off.

Over the next few weeks, Embark notified me as the analysis progressed. After 3 weeks or so, my pressing question was answered: they told me what type of dog she was!

Less than 50% chihuahua

To my absolute amazement, Riker is not even mostly chihuahua - the results show she’s mostly pomeranian!

I also learned that she has 1.5% “Wolfiness”, which indicates ancient wolf genes that have survived through to modern domesticated dogs. Cool!

Note: I am not a trained biologist. Regrettably, I spent much of my college biology course playing Pokémon Crystal, although I do remember some parts of the genetics sections.

A family tree

Embark also determines haplotypes, which means they can tell which of Riker’s genes came from which parent. This lets them generate a family tree:

Riker's family tree

This means Riker’s parents were a chihuahua mix and a purebred pomeranian.

Analyzing my dog’s DNA

Besides finally being able to see my dog’s genetic history, I was super happy that I could download the raw genetic data. When I unzipped the file, I was left with two text files:

riker.tfam // 75B
riker.tped // 6.7MB

(I renamed them for brevity)

I searched for information on these file types. They both seem to be related to the free PLINK genetics software package.

If we download PLINK and run it on these files, we get:

$ plink --tfile riker
Processing .tped file... 77%
Error: Invalid chromosome code '27' on line 166046 of .tped file.
(This is disallowed for humans.  Check if the problem is with your data, or if
you forgot to define a different chromosome set with e.g. --chr-set.)

This is pretty cool! PLINK defaults to human DNA. This file is from a dog, not a person. Looking through the plink --help file, we can see that they have support for lots of species:

--cow/--dog/--horse/--mouse/--rice/--sheep : Shortcuts for those species.

(I wonder why rice is so interesting to the software…)

Anyways, let’s run it with the --dog flag:

$ plink --dog --tfile riker
Processing .tped file... done.
plink.bed + plink.bim + plink.fam written.

PLINK wrote these files to the same directory:

$ ls

These look to be PLINK metadata files that the software uses to do its processing.

We can use PLINK to do some interesting sounding genetic computations. For example, if we run it with the --homozyg flag, we can see homozygosity reports. According to Wikipedia (I must have been in Johto during this part of biology), zygosity is “the degree of similarity of the alleles for a trait in an organism”. Running it produces:

$ plink --dog --tfile riker --homozyg
1 dog (0 males, 1 female) loaded from .fam.
--homozyg: Scan complete, found 27 ROH.
Results saved to plink.hom + plink.hom.indiv + plink.hom.summary .

The software knows that the dog is female, which is pretty cool. The files it generated seem to indicate the degree of homozygosity for her individual genes. Neat!

If we run with the --het flag, we can see inbreeding coefficients. The file it produces shows this:

O(HOM)    E(HOM) N(NM)  F
     0 3.051e+04 61022 -1

From this helpful documentation I found online:

O(HOM)    Observed number of homozygotes
E(HOM)    Expected number of homozygotes
N(NM)     Number of non-missing genotypes
F         F inbreeding coefficient estimate

-1 looks like it indicates a sampling error or contamination according to the docs:

The estimate of F can sometimes be negative. Often this will just reflect random sampling error, but a result that is strongly negative (i.e. an individual has fewer homozygotes than one would expect by chance at the genome-wide level) can reflect other factors, e.g. sample contamination events perhaps.

We can also use the software to find what parts of the genotyping are missing with the --missing flag. From this, I was able to gather that Riker only has a missing SNP rate of 0.0009034 (less than 1%). I think this means that Riker’s DNA in this sample is over 99% complete. Cool!
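
For the curious, that completeness figure is just the complement of the missing rate:

```shell
# Genotyping completeness from the missing SNP rate reported by --missing
awk 'BEGIN { printf "%.4f%% complete\n", (1 - 0.0009034) * 100 }'
```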

I may make Riker’s DNA available online some day for others to do genetic analysis.

Anyways, I thought this was a fun exercise into how genetic data is stored and processed. It’s really cool that there is open source software to analyze this data. Thanks for reading!

Live Markdown previews from vim
January 6, 2019 •  📝💻

This is more of a technical post.

This site is written using Markdown, which:

allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML)

It’s a common technique for writing blogs, as it lets the author write text in the text editor of their choice.

I write this page in vim, which is a command line text editor. I love vim for many reasons that are outside the scope of this post. I’ve been using it as my primary editor for almost 5 years. I wanted a good solution for writing Markdown in vim.

Because vim is a command line editor, it’s tougher to use for writing rich text. In a GUI program, rich text can be rendered inline as you edit. A command line tool can show you the formatting you have applied to text, but it does not give you an accurate rendering of how your content will look as you edit it.

I wanted to continue to use vim to edit my blog posts (I find vim plugins for other editors to be a worse approximation than “I can’t believe it’s not butter”). This post is about how I solved this problem.

A Markdown preview app

On macOS, you can get apps to render Markdown in a window. One that I like a lot is Marked, which automatically refreshes its preview when the Markdown file changes. This means you can open a .md file in Marked and get live updates as you change it. This can be used with any editor, including vim. The flow is something like: open the .md file in vim, open the same file in Marked, and edit away. As you save with :w, the preview will be auto-refreshed. I like this, but wanted to open it with a single key combination.

Adding Marked to vim

We can write a vim function to help us with this. Something like this:

function OpenInMarked2()
    " Open the file in Marked here
endfunction

Running a shell command from vim

We can use ! to run a command from our function. Technically, this is the filter command. From :help !:

!{motion}{filter} Filter {motion} text lines through the external program {filter}.

You can try this in your vim. Just run :!pwd, and you should be presented with your shell and your current working directory. Below it, vim helpfully tells us “Press ENTER or type command to continue”.

This is great! Using this and the macOS open command, we can at least launch Marked 2 with something like this:

!open -a Marked\ 2

But what about opening the current file? vim lets us use % from Ex commands as a stand-in for the path to the current file. So we can do something like:

!echo %

to see the current file. We can use this with open to open the current file in Marked:

!open -a Marked\ 2 %

This is the body of our OpenInMarked2 function:

function OpenInMarked2()
    !open -a Marked\ 2 %
endfunction

Once we have this function, we can call it with the Ex call command, like this:

:call OpenInMarked2()

This will open the current file in Marked, but will leave us at the shell with “Press ENTER or type command to continue” there. We still have to press enter before we can edit the file live.

Defining a mapping

We can use a vim mapping to call the function for us. A mapping lets us take a key combination and map it to a command. For the key combination, I like to use <leader>, which is a user-controlled modifier in vim. This means that <leader> followed by any key is wide open for use in your .vimrc. For my leader key, I use , (comma), as it’s near the other modifiers on a US keyboard. For leader mappings, I like a mnemonic key combination, as it makes it easier to remember.
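For reference, the leader key is set with one line in your .vimrc, and it has to come before any mappings that use <leader> (vim reads the value at the time the mapping is defined):

```vim
" Use comma as the leader key
let mapleader = ","
```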

A good choice for this mapping, which is applicable for Markdown and for opening Marked, is <leader>m. Our mapping to call the function might look like this:

nmap <silent> <leader>m :call OpenInMarked2()

Here’s a decoder ring for what these commands mean:

nmap: define a mapping for normal mode
<silent>: don’t echo the command on vim’s command line
<leader>m: the key combination to map
:call OpenInMarked2(): run our function

If you add this mapping and function to your .vimrc, you’ll see that it kind of works. You have to hit enter to get the command to actually run, and after doing so you’re left at the shell, so you need to hit enter again before returning to the editor.

Hitting enter twice

We can add a press of the enter key to our mapping using <CR>. This acts as though the user has pressed the carriage return. We’ll add two of them, as we need one to run the function and another to return to the editor. Our mapping now looks like this:

nmap <silent> <leader>m :call OpenInMarked2() <CR> <CR>

which is the mapping you’ll find in my .vimrc.
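Putting the pieces together, the complete addition to a .vimrc looks something like this (assuming the app is installed under the name "Marked 2"):

```vim
" Open the current file in Marked 2 for a live Markdown preview
function OpenInMarked2()
    !open -a Marked\ 2 %
endfunction

" <leader>m: call the function, with two carriage returns to skip
" past the shell prompt and return to the editor
nmap <silent> <leader>m :call OpenInMarked2() <CR> <CR>
```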

I hope you found this post helpful. I have been using OpenInMarked2 to help me write Markdown for 2 years now, and I think it’s a cool example of how to extend vim to fit your needs.

New Site, Who Dis?
December 31, 2018 •  🆕🌎

This is the first post on my new website! You’re reading this on the brand new site. Let’s break it down.

A (not so new) domain

It’s worth noting that this page is also hosted on my .as domain. I have had this domain from American Samoa for some time. I love the idea of domains that are people’s names with no added alphanumeric characters, and this one fits the bill for me.

My own static site generator - what could possibly go wrong?

My old page was powered by Hyde. While this project provides a lot of flexibility, I felt that I was not using all of its features. I wanted something dead-simple that could enable me to write more. As a result, I wrote my own.

This site is still statically-generated from Markdown, but it’s now done with a Bash script with only one dependency (a markdown binary). This makes it easier to set up on a new machine. It also lets me really understand how the site works.
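To give a flavor of the approach, here’s a minimal sketch of this kind of generator: one loop, one markdown dependency. This is just the general shape, not the actual script behind this site, and it falls back to cat when no markdown binary is installed so the sketch runs anywhere.

```shell
#!/bin/bash
# Minimal static site generator sketch: render every posts/*.md to site/*.html.

if ! command -v markdown > /dev/null; then
    # No markdown converter installed? Fall back to cat so the sketch still runs.
    markdown() { cat "$1"; }
fi

mkdir -p posts site
printf '# Hello\n\nFirst post.\n' > posts/hello.md   # sample post for the demo

for post in posts/*.md; do
    name=$(basename "$post" .md)
    {
        echo '<html><body>'
        markdown "$post"
        echo '</body></html>'
    } > "site/$name.html"
done
```

A real version would also wire in a template, dates, and an index page, but the core really is just a loop over files.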

The site generates quickly, and it’s been a great learning opportunity! For example, I learned how to format dates using date. I’m still learning (I don’t know web layout very well, for example) but I think it’s better to jump in with both feet when learning something new.
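As one example of what date can do, here’s how you might format today’s date in the style of this site’s post headers. The format string is my guess at the header style, not necessarily what the site actually uses:

```shell
# Format the current date like "December 31, 2018"
# (%B is the full month name, %d zero-pads single-digit days)
date '+%B %d, %Y'
```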

The site is missing some stuff (like RSS support), but I plan to iterate on it over time.

Hopefully I’ll write more

All of this is for naught if I don’t actually use the damn thing to write and think out loud.

I hadn’t updated my old webpage in over 6 years. There’s so much that has happened in that time that I want to write about.

Fingers crossed I’ll write about these things in the (near) future on this page.