bushi blog

Subtitle Website Generation

2022-11-22T00:00:00-06:00

This is the process I went through to create a way to view subtitles on a website, generated from an srt file. I did this to be able to watch Evangelion on LaserDisc, which doesn’t have an English sub or dub. You can see the result here.

Extract Subtitles

Use ffmpeg to extract subs encoded with video. Files often have multiple subs, so we have to figure out which we want. You can inspect the file with ffmpeg, or just export each stream and check them.

Files are mounted on a network drive, so we’re doing it in powershell, though the ffmpeg command is pretty portable.

Get-ChildItem -Path "./" | ForEach-Object { ffmpeg -i .\$_ -map 0:s:1 ./subtitles/$_.srt }

Once we have the sub files, we need to prepare them for templating, which means formatting.

Formatting for Jekyll

I use jekyll for this blog, so we need to convert the sub files into CSV so we can template off the values.

Luckily, the srt format is super simple. Basically just index, time range, text, and an empty line.

Using sed and awk, we can massage it into the format we want. To prepare for storing in a CSV, we need to double up any " marks so they’re escaped instead of ending the field. After exporting the sub files, I moved them to WSL so we can use bash now.

for f in ./*.srt; do
	# print column titles
	echo "index,timecode,text" > "${f%.srt}.csv"
	# double every " mark so it is escaped in the CSV,
	# then read each blank-line-separated block as one record
	sed 's/"/""/g' "$f" |
	awk 'BEGIN{RS="";FS="\n"}{print $1 "," "\""  $2 "\"" "," "\""$3 $4 $5 "\"" }' \
	>> "${f%.srt}.csv"
done

Or a monster one-liner:

for f in ./*.srt; do echo "index,timecode,text" > "${f%.srt}.csv"; sed 's/"/""/g' "$f" | awk 'BEGIN{RS="";FS="\n"}{print $1 "," "\"" $2 "\"" "," "\""$3 $4 $5 "\"" }' >> "${f%.srt}.csv"; done

After running this, I needed to make a few more changes to the original srt files. Some entries have empty lines in the subtitle text part, which break this parser because awk treats every blank line as the end of a record. Instead of trying to work around it, I just found all the places that matched the regex \n\n[^\d] and fixed them manually.
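For comparison, here is a rough Python sketch of the same srt-to-CSV conversion. It is not what I actually ran; the function names `srt_to_rows` and `rows_to_csv` are made up for illustration, and it assumes each block is an index line, a timecode line, then one or more text lines. Unlike the awk version it keeps every text line and joins them with a space. It does not solve the blank-line-inside-a-subtitle problem either.

```python
import csv
import io
import re

def srt_to_rows(srt_text):
    """Split SRT text into (index, timecode, text) tuples."""
    rows = []
    # Normalise Windows line endings, then split on blank lines.
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        if len(lines) < 3:
            continue  # not a full index/timecode/text block, skip it
        index, timecode = lines[0].strip(), lines[1].strip()
        text = " ".join(line.strip() for line in lines[2:])
        rows.append((index, timecode, text))
    return rows

def rows_to_csv(rows):
    """Render rows as CSV text; the csv module doubles any " marks."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["index", "timecode", "text"])
    writer.writerows(rows)
    return out.getvalue()

sample = (
    '1\n00:00:01,000 --> 00:00:02,500\nHe said "run"\nnow\n\n'
    '2\n00:00:03,000 --> 00:00:04,000\nOK\n'
)
print(rows_to_csv(srt_to_rows(sample)))
```

The csv module takes care of quoting the timecode (it contains a comma) and escaping the quote marks, so no sed step is needed.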

Jekyll include page

Next we need to set up a reusable include page that we just feed CSV data to, and which outputs all the subtitles in a table. We just need to pass the episode title and reference the episode CSV data in a way Jekyll likes. I came up with:

{{ include.title }}

<table style="font-size: 1em;">
	<tr>
		<th>Timecode</th>
		<th>Text</th>
	</tr>
	{% for entry in include.srtdata %}
	<tr>
		<td>{{ entry.timecode }}</td>
		<td>{{ entry.text }}</td>
	</tr>
	{% endfor %}
</table>

Styling

A new problem that arose was styling. The subtitles are meant to be displayed on a screen, so a lot of the formatting isn’t exactly website friendly. The original .srt formatting used <font> tags, which aren’t in HTML5. I simply used Ctrl+Shift+F to replace those and fix the styles, including color and font size.

Normalized it to a few common values, since it will just be displayed on a web page.

Subtitle Page

Now we create a page that uses our include file to display subtitles. I opted to display all the subtitles for a series on a single page, so I only have to create a single page. There’s no easy way to generate sets of pages, so a page per series isn’t too much to ask.

It ended up just being:

---
layout: default
title: Home
---

{%- include subtitles.html title="01 Angel Attacks" srtdata=site.data.eva_subs.eva_01_Angel_Attacks -%}
{%- include subtitles.html title="02 Unfamiliar Ceiling" srtdata=site.data.eva_subs.eva_02_Unfamiliar_Ceiling -%}
etc...

Encryption and Leaking Information

2022-11-20T00:00:00-06:00

Things can be encrypted, but encryption can be done many different ways, to varying effectiveness. Some methods preserve things like filenames and hashes while only the raw data of the file is encrypted. The files are still “encrypted”, but metadata about the files can often still be found. While encryption can secure the content of a message being communicated, it is only one piece of the puzzle: a variety of other information is generated around the message that can provide additional bits of information.

The Enigma machine was cracked in part because each day the weather was communicated with the same header, opening it to a partial known-plaintext attack. The Enigma algorithm shuffled on a letter-by-letter basis, meaning repeated strings in the plaintext could be exploited to learn clues about the encryption settings. Modern algorithms usually encrypt data in blocks, which just moves the goal posts from analyzing individual letters to single blocks of data.

A perfect encryption algorithm or method (hopefully) ensures the plaintext is secure, but other information surrounding how the message is communicated can still be derived. There are two things in particular that create metadata that is not usually considered.

Size

The size of the message is important, even if the encryption method uses padding. Block padding only rounds the length up to the next block, so a long ciphertext still reveals a long plaintext, and the length alone can leak information about the content of the message.

Because of this, all messages sent between two parties should always be padded to the same size, so there is no discernible difference between the ciphertexts of “A OK” and “Danger, run away”. Ideally that size also matches the other messages being sent by the system, so they blend in.
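A minimal sketch of fixed-size padding, applied to the plaintext before it is encrypted. The 256-byte size and the 2-byte length prefix are my own assumptions for illustration, not a standard scheme, and the encryption step itself is left out.

```python
import os
import struct

BLOCK_SIZE = 256  # hypothetical fixed size for every message, in bytes

def pad_message(message: bytes, size: int = BLOCK_SIZE) -> bytes:
    """Prefix the real length, then fill with random bytes up to `size`."""
    if len(message) > size - 2:
        raise ValueError("message too long for the fixed size")
    header = struct.pack(">H", len(message))  # 2-byte big-endian length
    filler = os.urandom(size - 2 - len(message))
    return header + message + filler

def unpad_message(padded: bytes) -> bytes:
    """Read the length prefix and return only the real message bytes."""
    (length,) = struct.unpack(">H", padded[:2])
    return padded[2:2 + length]

short = pad_message(b"A OK")
long_ = pad_message(b"Danger, run away")
print(len(short), len(long_))  # both are BLOCK_SIZE bytes long
```

Once both padded messages go through the same cipher, their ciphertexts have the same length, so size no longer hints at which one was sent.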

Time

The time a message is sent can be linked to other events, and leak information. If a message is always sent before a separate event takes place, it’s not too much to expect an attacker to link the two together. Since it’s not possible to control when a message needs to be sent, the trick is to be sending messages constantly.

When a message is actually sent, the cipher text will decode to something legible (padded to a standard size). If the user has no message to send, random garbage is sent instead. This stops an attacker from being able to tell if a real or fake message is being sent.
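The constant-sending idea could look roughly like this. It is only a sketch: `next_payload` and `MESSAGE_SIZE` are names I made up, the function would be called once per fixed time slot, and the result would still need to be encrypted before going on the wire.

```python
import os
import queue

MESSAGE_SIZE = 256  # hypothetical fixed payload size, in bytes

def next_payload(outbox: queue.Queue) -> bytes:
    """Pick what to send in this time slot.

    A queued real message is padded to MESSAGE_SIZE; if nothing is
    queued, random filler of the same size is sent instead.
    """
    try:
        message = outbox.get_nowait()
    except queue.Empty:
        return os.urandom(MESSAGE_SIZE)  # nothing to say: send garbage
    if len(message) > MESSAGE_SIZE:
        raise ValueError("message does not fit in one slot")
    return message.ljust(MESSAGE_SIZE, b"\x00")

outbox = queue.Queue()
outbox.put(b"Danger, run away")
first = next_payload(outbox)   # the real message, padded
second = next_payload(outbox)  # queue is empty, so this is filler
print(len(first), len(second))
```

Because a payload leaves in every slot whether or not there is anything to say, the send times stop lining up with real-world events.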

Notes on IPFS

2022-11-14T00:00:00-06:00

# add file to ipfs
echo "Hello, world!" > hello.txt
ipfs add hello.txt

# publishes using node public key
ipfs name publish 

# edit file
echo "Hello IPFS!" > hello.txt

# add again, then publish
ipfs add hello.txt

ipfs name publish 

# View file at 
# https://gateway.ipfs.io/ipns/k51qzi5uqu5dkkciu33khkzbcmxtyhn376i1e83tya8kuy7z9euedzyr5nhoew

initial setup process

generate new key
- /api/v0/key/gen
Import any other keys
- /api/v0/key/import
Add other nodes
- /api/v0/swarm/peering/add
add remote pinning services (optional)
- /api/v0/pin/remote/service/add
publish ipns address for item
- /api/v0/name/publish

normal operation

Pin item
- /api/v0/pin/add
Verify pins
- /api/v0/pin/verify
export pins
- /api/v0/pin/ls
grab list of peers
- /api/v0/swarm/peering/ls
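These endpoints belong to the local daemon's HTTP RPC API, which expects POST requests. Here is a small sketch that only builds a request without sending it; it assumes the daemon listens on the default local address, and the CID is a placeholder rather than a real one.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "http://127.0.0.1:5001"  # assumed default local RPC address

def rpc_request(endpoint: str, **params) -> Request:
    """Build (but do not send) a POST request for an RPC endpoint."""
    url = f"{API_BASE}{endpoint}"
    if params:
        url += "?" + urlencode(params)
    return Request(url, method="POST")

req = rpc_request("/api/v0/pin/add", arg="QmExampleCid")  # placeholder CID
print(req.get_method(), req.full_url)
```

With a daemon running, passing `req` to `urllib.request.urlopen` would actually perform the call.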

Advantageous encoding

2021-11-15T00:00:00-06:00

Information is ethereal, and “disappears” with nothing to contain it. The method in which information is stored can not only influence our interpretation of the information, but what we are able to do with it. Special storage formats can allow us to more easily glean additional information from what is stored.

Following is a series of interesting or novel storage/encoding methods, in order of confidence.

Binary / Base 2

We take numbers (2, 7, 2893479283749283) and encode them using 1 and 0 (on/off). This format gives us a few benefits that base 10 numbers don’t.

Easily tell if a number is odd or even by looking at the least significant bit (0b11 = odd, 0b10 = even).
Easily multiply/divide by 2 by shifting bits left/right. Faster than using the multiply instruction (for integers at least).
Small alphabet: any number can be written with only two symbols, though it takes more digits than base 10.

`0b0010010`	`18`	x	-
`0b0001001`	`9`	x/2	>>
`0b0100100`	`36`	x*2	<<
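The same table as a quick Python check. The `is_odd` helper is just a name picked for this sketch.

```python
x = 0b0010010  # 18

def is_odd(n: int) -> bool:
    """A number is odd exactly when its least significant bit is 1."""
    return (n & 1) == 1

half = x >> 1    # shift right once: divide by 2
double = x << 1  # shift left once: multiply by 2

print(x, half, double)
print(is_odd(x), is_odd(half))
```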

Polish notation

A way of encoding mathematical operations so that they fit much better into a stack. As a result, it (and especially its postfix variant, reverse Polish notation) made it much easier for early calculators to parse and perform math operations. It’s also special in that the order of operations is determined by the order of the operators and operands, so no parentheses are needed.

Normal	Polish
3+4	+34
(3+2)-5/6	-+32/56

https://prettyboytellem.com/writing/2019/Apr/HP-50G%20and%20RPN.html
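To show why this notation is easy to parse, here is a small sketch of a prefix evaluator. It assumes single-digit operands like the examples in the table, and `eval_prefix` is a made-up name. The recursion leans on the call stack, which is the stack-friendliness mentioned above.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_prefix(expr: str):
    """Evaluate single-digit Polish notation such as '-+32/56'."""
    tokens = iter(expr.replace(" ", ""))

    def parse():
        token = next(tokens)
        if token in OPS:
            left = parse()   # first operand follows the operator
            right = parse()  # second operand follows the first
            return OPS[token](left, right)
        return int(token)

    return parse()

print(eval_prefix("+34"))
print(eval_prefix("-+32/56"))
```

No precedence rules or parentheses are needed: each operator simply consumes the next two complete operands.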

Homomorphic Encryption

This is a special method of encrypting data, such that mathematical operations can be performed on the encrypted data without needing to decrypt.

An example could be a datastore of population statistics. Homomorphic encryption would allow researchers to perform queries against the data without leaking private information (if configured correctly).

Other methods of encoding to perform specific mathematical operations.

Types[^1]:

Partially homomorphic encryption encompasses schemes that support the evaluation of circuits consisting of only one type of gate, e.g., addition or multiplication.
Somewhat homomorphic encryption schemes can evaluate two types of gates, but only for a subset of circuits.
Leveled fully homomorphic encryption supports the evaluation of arbitrary circuits composed of multiple types of gates of bounded (pre-determined) depth.
Fully homomorphic encryption (FHE) allows the evaluation of arbitrary circuits composed of multiple types of gates of unbounded depth, and is the strongest notion of homomorphic encryption.
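A toy illustration of the "partially homomorphic" case: unpadded textbook RSA happens to be homomorphic for multiplication. The tiny key below (p=61, q=53) is a common classroom example and is in no way secure; it is only here to make the property visible.

```python
# Toy textbook RSA key: N = 61 * 53, public exponent E, private exponent D.
N, E, D = 3233, 17, 2753

def encrypt(m: int) -> int:
    return pow(m, E, N)

def decrypt(c: int) -> int:
    return pow(c, D, N)

a, b = 7, 3
# Multiply the two ciphertexts without ever decrypting them...
product_cipher = (encrypt(a) * encrypt(b)) % N
# ...and the result decrypts to the product of the plaintexts.
print(decrypt(product_cipher))
```

This only works for multiplication, and only while the product stays below N, which is why it counts as partially homomorphic.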

https://github.com/Microsoft/SEAL

https://people.csail.mit.edu/vinodv/FHE/FHE-refs.html

Floating point separation and computation

Encode each side of the decimal point as a separate integer, then perform the operations on them separately before joining them back together into the floating point result.

Numbers less than 1 have different “behaviors” than integers.

Difficult math operations: roots and squares

change the format based on the value?

Base 2 numbers allow us to raise 2 to a power easily.

2^3 == 8 == 0b10 shift left twice -> 0b1000

16^2 == 256 == 0x10 shift left one hex digit -> 0x100

encode number n in base k to get e
To compute n * k, shift e left once

Goal: compute 352 * 16

1.	Have math you want to perform	`n * k`	`352 * 16`
2.	Convert `n` to base `k` format	`n' * k`	`0x160 * 16`
3.	Shift `n'` left one digit to multiply by `k`	`n' * k => n' << 1`	`0x160 << 1 => 0x1600`
4.	Convert back to the target base (usually base 10)		`0x1600 => 5632`
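The four steps can be sketched like this. The helper names are made up, and the conversion here uses ordinary division, so this shows the idea rather than any actual speedup.

```python
def to_digits(n: int, base: int) -> list:
    """Digits of n in the given base, most significant first."""
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)
    return digits[::-1] or [0]

def from_digits(digits: list, base: int) -> int:
    """Rebuild the integer from most-significant-first digits."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

def multiply_by_base(n: int, k: int) -> int:
    """Compute n * k by appending a zero digit in base k (a left shift)."""
    return from_digits(to_digits(n, k) + [0], k)

print(to_digits(352, 16))         # digits of 0x160
print(multiply_by_base(352, 16))  # same value as 352 * 16
```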

To multiply two numbers, reduce number to primes.

2253 * 18 => 2253 * (3 * 3 * 2)

Perform each conversion individually, using the output of the previous conversion.

* (3 * 3 * 2)
* (3 * 2)
* (2)
== 2253 * 18

Simple to see for small numbers, also probably faster than converting back and forth. Good for very large numbers, where it could be difficult to multiply together due to hardware constraints (numbers larger than registers).
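A rough sketch of chaining one shift per prime factor, using the 2253 * 18 example. `shift_in_base` and `multiply_via_factors` are names invented here, and the factor list is passed in by hand rather than computed.

```python
def shift_in_base(n: int, base: int) -> int:
    """Multiply n by `base` with a one-digit left shift in that base."""
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)   # least significant digit first
    digits.insert(0, 0)    # the shift: a new zero in the lowest place
    return sum(d * base**i for i, d in enumerate(digits))

def multiply_via_factors(n: int, factors: list) -> list:
    """Apply one base shift per factor and return every intermediate value."""
    steps = [n]
    for f in factors:
        n = shift_in_base(n, f)
        steps.append(n)
    return steps

print(multiply_via_factors(2253, [3, 3, 2]))
# [2253, 6759, 20277, 40554], and 40554 == 2253 * 18
```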

The conversion base would usually be prime numbers, but it would also work with composites. Converting with base 4 could be used to perform the same operations as base 2, but twice as fast. Because 4 is 2 squared, one base 4 shift does the work of two base 2 shifts, so half as many shifts need to be made. To multiply by 4, you could either convert to base 4 and shift once or convert to base 2 and shift twice.

This could be used to optimize calculations, especially if the numbers are extremely large. Large bases like 64 might be nice so you don’t have to repeat base 2 shifts 6 times, just a base 64 shift once. I assume that smaller bases perform less “work” per shift, but numbers can more easily be converted to and from them. It’s much quicker to convert a number to base 2 than base 64, especially because of hardware properties.

Prime notation

This would be an encoding that signals whether a number is prime by the way it is encoded. It would be similar to determining whether a binary-encoded number is odd or even based on the least significant bit.

Factor notation

Encode numbers so that the factors of a number are included. This would allow a number to be factored much quicker because the hard work has already been done, a program simply needs to perform a lookup based on the number. While not that useful for smaller numbers, it may prove useful for operations performed on larger numbers.

“Extra information” storage, meta-encoding

When storing values, pre-calculate properties about the value being stored, depending on its usage.

Store information like the value’s factors or alternate versions of the value, so you don’t have to calculate them later and so any calculations you run with the value perform quicker.

If you’re able to make a good guess about calculations that will take place over a set of data, then you can potentially start calculating them before it’s needed.

A good example could be yearly financial records for a company. Once the year is over, no additional data is being added for that year. The company will want to use that data in the future to analyze how they performed and find takeaways from the data. At a minimum, different metadata could be generated to identify trends, averages, and other statistical information.

Storing the metadata alongside the raw data makes it much easier for a “consumer” program to analyze the raw data and take shortcuts generated by the metadata. By pre-computing on the data, we can save that from needing to be computed again in the future, as long as the calculations required are the same.
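As a small sketch of the idea, the record below stores a few pre-computed statistics next to the raw values. The figures and field names are made up, and which statistics are worth storing depends entirely on what the consumer needs.

```python
import json
import statistics

def with_metadata(values: list) -> dict:
    """Bundle raw values with statistics computed once, at write time."""
    return {
        "data": values,
        "meta": {
            "count": len(values),
            "total": sum(values),
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        },
    }

quarterly_revenue = [120, 135, 128, 150]  # made-up figures
record = with_metadata(quarterly_revenue)
print(json.dumps(record["meta"]))
```

A consumer that only needs the total or the average can read `meta` and never touch the raw list.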

Other Computing shortcuts

http://www.lomont.org/papers/2003/InvSqrt.pdf

Takes advantage of how floating point numbers are stored combined with magic number 0x5f3759df to calculate an inverse square root - 1/sqrt(x).
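A Python rendition of the trick, for illustration only. Python has no raw pointer casts, so `struct` is used to reinterpret the 32-bit float's bits as an integer. It assumes a positive input and does a single Newton-Raphson refinement step.

```python
import struct

def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the magic-number trick."""
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    (i,) = struct.unpack("<I", struct.pack("<f", x))
    # Halving the bits roughly halves the exponent; the magic constant
    # corrects the result into a good first guess.
    i = 0x5F3759DF - (i >> 1)
    (y,) = struct.unpack("<f", struct.pack("<I", i))
    # One Newton-Raphson step sharpens the guess.
    return y * (1.5 - 0.5 * x * y * y)

print(fast_inv_sqrt(4.0))  # close to 0.5
```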

Kindle 2 (2009) Review

2021-09-22T00:00:00-05:00

This is a review of the Amazon Kindle 2, which was originally released in 2009. Recently, I’ve been using it to read through my backlog of books and wanted to get down my thoughts on the device.

Overall, it is a very usable e-reader. Once the books are on the device, it isn’t too difficult to read them. The real problem I’ve had with it is getting books onto it, and fighting through the arcane design decisions Amazon made.

Specs from Wikipedia:

OS	2.5.6
CPU	Freescale 532 MHz, ARM-11 90 nm processor
Memory	32MB
Storage	2GB / 1.4 Usable
Battery	3.7V 1,530 mAh
Display	6-inch e-ink, 600 x 800, 167 PPI, 16-level grayscale
Input	MicroUSB, 3.5mm headphone jack, keyboard
Connectivity	Wi-Fi (BG), Bluetooth
Dimensions	203mm x 135mm x 9mm
Mass	290 g
Original Price	$299

Externals

The Body

The body is formed out of plastic. It feels pretty solid, but tapping on the bezel or switching pages with a button press sounds a little hollow. It’s about 135mm wide and 203mm tall (236mm diagonal). It fits pretty well in my hand, if a little heavy.

The front is a white plastic, with the Amazon kindle logo on the top bezel. The back is split into two parts, top 3cm vs bottom 17cm. The bottom part is also covered in a metallic coating, that makes it cool to the touch. This coating is also able to activate the touch screen on my Pixel 2, for what it’s worth. An Amazon kindle logo is located just below the border between the two parts of the back.

At the bottom of the back are the two speakers and regulatory information.

Buttons

All buttons are located on the front of the device, except for the power and volume buttons. To the left and right of the screen are two sets of buttons, each set having a smaller button placed over a larger button. The bottom of the button sets aligns with the bottom of the screen. The larger buttons both navigate to the next page. The smaller button on the left goes to the previous page, and the smaller button on the right navigates back to the home page. I like how the “next page” button is located on both sides, allowing the user to hold the device with either hand.

Underneath the screen is a keyboard, with a total of 45 buttons. It has the following characters: 0-9, A-Z, ., /, , and Font size. They work as a keyboard, but the buttons are rather small. Also, they don’t have a lot of feedback so it can be hard to tell whether you typed something, especially when paired with the limited refresh rate of the screen.

To the right of the keyboard is a set of navigation buttons. It has a 4 direction + press nib, a Menu button above and a Back button below. This nib is the primary way of navigating around the Kindle until you start reading.

On the right edge near the top are the volume buttons (up + down). Along the top, next to the headphone port, is the on-off switch, which is spring loaded. This is used to put the device to sleep and wake it back up.

Screen

Display

6-inch, 600 x 800, 167 PPI, 16-level grayscale

The screen is a 6-inch E-Ink display. Screen dimensions are 9cm x 12.1cm (15cm diagonal), taking up about 39% of the device’s face. The left and right bezels are 2.3cm, and the top is 1.7cm. Pixel dimensions are 600 x 800, at 167 PPI. The screen itself goes right up to the edge of the plastic, so there isn’t a ring of unusable whitespace around the edge.

The screen works fine for text, and is able to support a variety of font sizes for your reading pleasure. Sadly, it isn’t quite high enough quality for images. I’ve tried reading manga on the device, but there just aren’t enough pixels to properly display things, especially when there’s smaller text. There’s also the whole issue of getting the manga on the device, but I’ll go more into that in the internals section.

Refreshing the screen while reading takes about 0.5-0.9 seconds. Refresh goes from right to left, mimicking the turning of a page. A screen refresh while turning pages does the following:

Toggle pixels to new on/off state (doesn’t change matching pixels)
Toggle all pixels to opposite of original state (not new state)
Toggle pixels to new on/off state for good

I believe it does it in this order to make sure that all the pixels switch to the next state correctly. This process makes sure that each pixel goes through an entire on/off cycle before settling on the new screen. Since it is an E-Ink reader, once the pixels are set it doesn’t require additional energy to keep them there, so power is only consumed by the screen while it is changing.

When turning the page while reading, it will refresh the entire screen. While navigating around menus and such, it will usually only refresh pixels that are changing, not the entire screen.

Internals

Oh, Amazon, the big A. They sure did a number on the device and really spent some effort locking it down. They really limit the ways in which you can interact with the device. In order to use much of the networking ability of the device, you must first register with your Amazon account. This also gives Amazon the ability to reach into your device and muck about. In a good example of literal Orwellian censorship, in 2009 Amazon remotely deleted certain copies of Animal Farm and Nineteen Eighty-Four directly from users’ devices. The publisher didn’t have the rights, but this also showed Amazon’s ability to directly manage files on the device without user approval.

Luckily, there is a jailbreak available which helps open up much of the functionality of the device: https://wiki.mobileread.com/wiki/Kindle_Hacks_Information The jailbreak process isn’t too complicated, and just takes an afternoon. Once the device is freed, you can install a variety of add-ons to enhance the device. This includes adding support for more file and audio filetypes, alternate font packs, and fancy things like USB Networking and a terminal emulator.

My favorite change to make was updating the screen savers. When the device is in sleep mode, it will display an image so something is on the screen. The device comes built in with a good number, but I had a few issues with the images they chose. A few of them advertise Amazon services, which infuriates me the most, but some of the other images just aren’t the best for the e-ink screen. They usually have a lot of shading and poor dithering, which just doesn’t show up well with the e-ink. Once jailbroken, it’s easy to add your own photos. I used the batch convert feature in XnViewMP to scale pictures to 600x800 and convert them to grayscale. When preparing images, be sure to keep in mind the ability of the e-ink to display shading. This Kindle has 16 different shades, but the original only had 4 (like the Game Boy!).

Accepted file formats are another limiting factor of the device. Only the following formats are accepted: MOBI/PRC, TXT, TPZ, AZW, HTML and PDF (Firmware > 2.2). It is also able to play audio AAX files. One filetype conspicuously missing is EPUB. It is thought that it was left out because of its lack of DRM, which re-iterates where Amazon’s priorities are. But hey, that’s why jailbreaks exist. I found this website useful to prepare books for loading pre-jailbreak: https://convertio.co/epub-mobi/

It’s not all doom and gloom though. There are some very useful tools available to the reader. My favorite is the highlight tool, which allows you to highlight words and passages. The user is also able to take notes easily by navigating to a location on the page and typing. Both the notes and the highlights are stored in a single separate file, storing the book title, location, date, and passage/note.

I also like the dictionary lookup. By hovering over a word with the cursor, the dictionary definition for it will appear at the bottom of the screen (replacing the progress bar). Mine only has an english dictionary, but I’m sure other languages were/are available.

Do these features warrant adding an entire keyboard to the device? Eh. It’s useful for when you need to use it, but 99% of the time I’m just reading, making the keyboard kinda wasted. Amazon probably thought the same, as we can see from the smaller keyboard on 3rd generation devices, and no keyboard at all on the 4th-gen. Instead, they moved to a touch screen (and built-in ads unless you paid more 😡).

Conclusion

The device I have originally came out in 2009, so it’s been 12 years and it’s still chugging along. I can’t find any dead pixels, and the battery life is still great. I’m not sure exactly how well the battery compares to how it was at launch, but as it is now I only need to charge maybe once a week, reading about 2-3 hours a day. In any case, it’s not difficult to purchase a new battery online and replace it yourself.

It’s very usable as an e-reader, once it’s opened up a little. While it’s not great for comics or manga, books read just fine. The lack of a backlight might be important for some, but I feel that it makes reading feel closer to an actual book, requiring you to be near a light source.

I also find that books I read on it sort of blend together, since they are all displayed the same way. My brain doesn’t have the physical book differences to inform itself, so sometimes I get confused about who’s in what book, but that just means I need to pay more attention.

The screen is a little small, and even with a smaller font size doesn’t fit a whole lot on the screen. The keyboard also takes up a lot of real estate on the device, and its value depends on how often the reader is taking notes and such. I wish it had a slide-out keyboard instead, similar to older phones from the 2000s era. It would likely have made the device thicker, but it would give you more screen space while also preserving the ability to type. Newer Kindles and e-readers in general have moved to touch screens, so the keyboard is just on the screen, but then you run into issues with how interactive the device feels because of the screen refresh rate.

Because these devices are “old”, they are easy to find cheap on sites like eBay. You can easily pick up this or other Kindle models there for less than $50. Even if it isn’t jailbroken, you can still convert any e-books you have to the .mobi format and quickly throw them on the device. If you’re looking for books or things to read, here are some good places to look:

Library Genesis: https://libgen.is
Sci-hub: https://sci-hub.st
Project Gutenberg: https://www.gutenberg.org
Internet Archive: https://archive.org/details/books

And a short list of things I’ve read on the device:

The Expanse: Leviathan Wakes - James S. A. Corey
The Big Sleep - Raymond Chandler
The Centauri Device - M. John Harrison
The Dream Life of Sukhanov - Olga Grushin
The Forever War - Joe Haldeman
Ringworld Series - Larry Niven
Sirius - Olaf Stapledon

Gemini and IPFS

2021-08-29T00:00:00-05:00

IPFS provides an underlying distributed storage, which allows files and data to be hosted in a decentralized manner. Gemini is a very simple protocol, and while it’s usually served from a single central server, the .gmi files can easily be hosted on IPFS just like any other file or website. If you’re new to IPFS, be sure to skim through IPFS Concepts. Of course, hosting the files is different from serving the files.

Gemini probably couldn’t pull directly from IPFS. Instead, there would need to be something running between IPFS and the client to properly communicate the protocol.

While a client could just fetch files directly, Gemini still needs to be able to perform transactions and confirm things like TLS. IPFS seems to be primarily accessed through a web browser, so the hosted Gemini files don’t exactly work like they should.

For example: https://ipfs.io/ipfs/QmUAiFm3GfTMQH9HnHhYERLGEHFzZJXPx8Q2JS3crXiPGD This is the Gemini folder for this site, allowing one to see all the files. It sort of behaves like an index, but navigating to any of the pages in a web browser just shows the file, and doesn’t fill in the links (expected): https://ipfs.io/ipfs/QmXpFa5CF3cGwUVDXk3Sq1G5d247frbnCUBEracXXJKzjT?filename=index.gmi .

Instead of both storing and serving content with IPFS for Gemini, it seems that IPFS would only serve as storage, with a Gemini-IPFS proxy(?) acting as the server. Another method might be to add IPFS support to Gemini clients, but then you basically circumvent/throw out all the TLS and certificating, unless the cert is hosted along with the Gemini files…

It’s possible to create a client to read directly from IPFS, but I think that would be expanding the abilities of the client beyond what is needed. I think a better answer would be creating a Gemini server and having IPFS behind that. This also ensures that the files being shared are pinned. Just like how there are HTTP -> IPFS gateways, and HTTP -> Gemini gateways, a Gemini -> IPFS gateway could also be created, or they could be chained together into Gemini -> HTTP -> IPFS.
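To make the gateway idea a bit more concrete, here is a sketch of the two small pieces such a Gemini server would need: mapping an incoming Gemini request to a path on an IPFS HTTP gateway, and producing a Gemini success header. The gateway address is the usual local default, the root CID is a placeholder, and the TLS handling and the actual fetch are left out.

```python
from urllib.parse import urlparse

GATEWAY = "http://127.0.0.1:8080"  # assumed local IPFS HTTP gateway
ROOT_CID = "QmExampleRootCid"      # placeholder, not a real CID

def gemini_to_gateway_url(request_line: str) -> str:
    """Map a Gemini request (an absolute URL ending in CRLF) to a gateway URL."""
    url = urlparse(request_line.strip())
    if url.scheme != "gemini":
        raise ValueError("not a gemini URL")
    path = url.path or "/index.gmi"
    if path.endswith("/"):
        path += "index.gmi"  # serve the index for directory requests
    return f"{GATEWAY}/ipfs/{ROOT_CID}{path}"

def success_header(mime: str = "text/gemini") -> bytes:
    """Gemini response header: status 20, a space, the MIME type, CRLF."""
    return f"20 {mime}\r\n".encode("utf-8")

print(gemini_to_gateway_url("gemini://example.org/posts/\r\n"))
print(success_header())
```

The server would fetch the mapped URL, then send the header followed by the file body over its own TLS connection, so the Gemini client never needs to know IPFS is involved.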

Another thing to consider is updating files and making sure they are available. If the node hosting the files goes down, it’s important that the files are also available somewhere else on the network, which is where pinning services come in. These are services that you post a content address to and they pin the content on their side. This ensures that even if your own node goes down, the files will still be available from the pinning service. Of course, it is best if you also use multiple pinning services, just so there is additional distributed redundancy.

Interacting with IPFS isn’t too difficult, since it is usually done through the daemon instead of interacting with the network directly. I want to do more investigating, but my thinking is it wouldn’t/shouldn’t be too difficult to create simple “glue code” that allows other protocols to easily take advantage of IPFS storage without needing to be aware of it.

Web Browser Cookie Management

2021-08-11T00:00:00-05:00

Cookies are almost invisible to the user, despite being used on many sites and being an easy way to identify a user. With a default configuration, it is very difficult to view and manage them. Because of this, users aren’t very aware of the cookies sites save on their machine.

The only time people ever think about cookies is when sites bug them to accept all their tracker cookies. I’m sure people’s first thought is not “Ahh, my privacy!” but something along the lines of “make the intrusive box go away”. I agree with the reasoning behind the GDPR rules around cookies, but in practice it just creates an annoyance for the user. Every site is different, and it is difficult for the user to quickly parse and click through all the menus to select the minimum amount of cookies. There is no standardization in the way “cookie permissions” are asked for, so it’s the responsibility of each site to get permission for storing cookies. It’s in their best interest to make it as tedious and difficult as possible to reject cookies so they can stuff as many performance and ad cookies on your machine as possible.

While we can’t change how sites ask for cookie jar privileges, we can control how a user manages cookies. By making it easier for the user to be aware of cookies on a site they can more easily take control of their cookie privacy.

From Mozilla: https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies

Cookies are sent with each request to a website, and serve 3 primary purposes:

Session management - logins, shopping carts, performance
Personalization - preferences, themes
Tracking

For most casual browsing, no cookies should be required. If you’re reading articles or checking out blogs, there is no real reason for you, the user, to collect any cookies from a site. For some sites, session cookies are unavoidable. If you want to log in to a website and have that login carry over as you browse different pages, this type of cookie is a must. The tracking cookies, and to a lesser extent the personalization cookies, are the ones we don’t want.

When saving a cookie, the website can configure a variety of options for the cookie, including lifetime, site attributes, and security flags.
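As an illustration of those options, here is how a server-side script could build a `Set-Cookie` header with Python’s standard library. The cookie name and value are made up.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "abc123"              # made-up name and value
cookie["session_id"]["max-age"] = 3600       # lifetime: one hour
cookie["session_id"]["path"] = "/"           # valid for the whole site
cookie["session_id"]["secure"] = True        # only sent over HTTPS
cookie["session_id"]["httponly"] = True      # hidden from page JavaScript
cookie["session_id"]["samesite"] = "Strict"  # not sent on cross-site requests

header = cookie.output()
print(header)
```

Each attribute ends up as one `;`-separated field after the name=value pair, which is also what the cookie manager extensions below let you filter on.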

Browser Add-ons to Manage Cookies

Managing cookies should become a more visible part of the web browser, similar to password auto-fill and bookmarks. Users should have a way to more easily see the cookies generated by a site.

A short feature wishlist

view + sort all cookies at once
filter/search based on site, reference url
View cookies for current site, with number on icon like ad blockers
delete/manage cookies

Here are two extensions for Firefox I tried.

Cookie Quick Manager

Only allows you to look at cookies one domain at a time. Within a domain you can filter on cookie data, either the name of the cookie or the value, but the search box only matches domain names. There is no way to search cookie info directly; you have to nail down the domain first.

By clicking on extension icon, gives you the option to:

View all cookies
Search for cookies from domain
Delete current site cookies (lists # of cookies)
Delete all context cookies, all cookies from current tab container context ( also lists # of cookies)
Delete current site local storage.

You can view how many cookies the current site has by clicking the icon, which is nice.

Cookie manager: https://github.com/Rob--W/cookie-manager

Icon menu has two options:

Open Cookie manager
Open cookie manager for current page

Selecting either opens a new tab, and gives you a search header to find cookies.

Opening for the current page shows all the cookies the current page has.

This one is better because it lists the cookies in a table. It also has more search and filtering options, such as:

Website filter (full url or domain)
filter by name
filter by value
Secure (any/yes/no)
httpOnly (any/yes/no)
SameSite (any/unset/strict/lax)
Session (any/session/non-session)
min/max expiry date
Cookie jar (which container tab)
Whitelist (any/yes/no)

This extension gives you a lot more power to search across cookies, instead of restricting you to domain+cookie like Cookie Quick Manager.

Unfortunately, it doesn’t show the current site cookie count while browsing. It would be nice to see it similar to the way uBlock Origin shows the count of blocked ads on the current site.

It does provide a lot more power to the user to manage cookies across all websites. It gives the user the ability to easily select and remove one or more cookies. It also gives the ability to create a new cookie, as well as a way to export the cookies to JSON or Netscape format, as a file or just as text. It is able to import both formats as well.

The big thing missing in the cookie browser is a way to sort the results. You can perform searches really well, but the results are static, and don’t allow you to sort on the different columns. The extension is open source, so maybe I just need to get off my ass and contribute.

More news: Firefox 91 introduces enhanced cookie clearing

Big Tech, Services, and Responsibility

2021-07-11T00:00:00-05:00

Currently, Google and other big tech companies like Facebook, Twitter, TikTok, etc. deploy a variety of algorithms and machine learning models to assist in managing and moderating users. From surfacing search results to generating content recommendations, the algorithms act as a proxy between us and what we request, silently guiding us who knows where. It’s this “guiding” part that has proved troublesome for Google and others (FANG, GAFAM, FAAMG, “Big Tech”, etc.).

Because the platform controls the presentation of content, governments and lawmakers have determined that the companies are responsible for the content. It has fallen on the companies to moderate the content, which is not an easy task. Just determining whether a piece of content is admissible can be difficult, and as we try to define the line more precisely it starts to create problems. See how YouTube has been removing recordings of war crimes that are used in legal cases. To absolve themselves of responsibility, the companies need to make the users themselves responsible for their own data and actions. Internally at Google, I think it would be framed as a financial move rather than a user privacy move. Heck, maybe the next step would then be to sell legal support to the same users and make even more money.

Because Google is in control of the various algorithms and the presentation of content, they seem to be responsible for moderating. This move could put the control of the algorithm in the hands of each user, in a “personal assistant” type way. Then the responsibility for content falls directly on the user. When users post objectionable content, law enforcement would go directly to the user’s “assistant” instead of needing to go through Google.

Google’s offerings encompass a wide variety of services. Not only are there services, but also large amounts of data behind those services. For example, Google Search is a service, but there is also all the indexing data behind it that makes it work. Maybe instead of offering a subscription to everything, they sell the services piecemeal with a one-time payment (like Windows), and offer subscriptions to the data that backs those services. They could also sell service hosting, but the services would be sold in such a way that they could be deployed to any of the public clouds or one’s own private cloud.

I’m sure many of the algorithms used for Google services have machine learning models at their core. One issue posed by making each user distinct is that instead of having one single model, each user has their own personalized model, and it may be prohibitively expensive to train and manage that many models. Each user would start with a “base” model, but as they use and interact with it, it would start to learn more about the user’s mannerisms. Keeping a distinct model for each user and continually updating it could be extremely expensive. One solution could be to have “tiers”, similar to how users get avatars in the “Metaverse” in the book Snow Crash, by Neal Stephenson:

The couples coming off the monorail can’t afford to have custom avatars made and don’t know how to write their own. They have to buy off-the-shelf avatars. One of the girls has a pretty nice one. […] Looks like she has bought the Avatar Construction Set™ and put together her own, customized model out of miscellaneous parts.

…

They flicker and merge together into a hysterical wall. […] Wild-looking abstracts, tornadoes of gyrating light—hackers who are hoping that Da5id will notice their talent, invite them inside, give them a job. A liberal sprinkling of black-and-white people—persons who are accessing the Metaverse through cheap public terminals, and who are rendered in jerky, grainy black and white.

Having users pay Google for use of their services and data would allow Google to move away from advertising as the primary revenue stream. They could even still sell ads on it, where the services they offer still phone home to Google for ads to display.

Maybe a solution could be to run the ML models on the user’s device. In the beginning, this would be difficult since mobile phones aren’t the most powerful devices. Google already makes their own ML hardware (the Tensor Processing Unit, or TPU) and their own phone, so all they would need to do is join the two together. Then Google would be able to place more custom hardware in the device (like an mTPU) so that the training and learning would take place on the user’s device instead of Google’s servers. This would also help separate responsibility a bit more. To accommodate older phones or to make it phone-agnostic, they could make it its own device that would connect to a user’s existing device through USB or maybe encrypted Bluetooth or something. It could be designed as a little USB passthrough that would attach to a phone in a pleasing way (integrate with the case or something).

This is all just using Google as the example though. Under the guise of decentralization, companies could turn their current offerings into subscription services that the user pays to use. Users would be able to self-host their services (or pay an additional subscription to have them managed), and companies would be able to make money off the users. This would all be done to put the responsibility squarely on the user, while also making money off of them. Additionally, by putting the tools of moderation and content control in the hands of users, people are able to form their own communities that are responsible for themselves. The trick is having a good enough user experience to coax people into using it.

What benefit would this bring to the user? For one, the models would be completely private to the user. It would allow people to place more trust in the services and know that big ol’ Google isn’t looking over their shoulder (as much…). The services would also be completely personalized to you, so the recommendations could be more exact. I think that, paired with the improved privacy, people would also be a bit more open about what they actually want from the service. People may not self-censor as much when they have greater trust in the privacy of a service.

While there would be some benefits to the user, I think in the short term this would primarily benefit the big corporations. While they wouldn’t have as much direct control, they would be free of the responsibility for much of the content users create while being able to create a new revenue stream directly from users, instead of just through advertising.

Update 02/08/2021:

Google just announced their new Pixel 6 phone, and as part of it also announced that Google Tensor would debut on the device.

Echoing what I wrote above, they noted:

AI is the future of our innovation work, but the problem is we’ve run into computing limitations that prevented us from fully pursuing our mission. So we set about building a technology platform built for mobile that enabled us to bring our most innovative AI and machine learning (ML) to our Pixel users.

Future of Special Purpose Hardware Part 1: General Purpose

2021-07-07T00:00:00-05:00

In the 90s, to be able to play the newest games with the best sound you needed a sound card, like an AdLib or Sound Blaster. This was because motherboards of the time didn’t include hardware for sound. Back then, hardware was much weaker than today and the amount of processing power required to generate good sound was beyond the bounds of the hardware. By moving the sound generation to a separate add-in card, users could choose which card they wanted from a variety of different price and performance points. With the card, the processing required to generate the sound could be offloaded onto the card, only requiring the CPU to coordinate things.

Current performance outlook

Thanks to Moore’s Law and other performance improvements, we have gotten to a point where we no longer need task-specific hardware for most computing tasks. For tasks like gaming or video editing, it’s still usually best to have a graphics card to offload the costly graphics processing onto hardware specifically designed for it. While a special-purpose graphics processor might be less powerful than the host CPU, the careful joining of software and hardware enables it to outperform the general-purpose CPU on those graphics and parallel compute tasks.

As we push performance and compute tasks further and further, it can be hard to develop and work on these tasks from a single home computer. To run large workloads, process large amounts of data, or train a large AI, you need more power. Cloud services have filled this niche, allowing a user with enough capital to farm out work onto hundreds or thousands of CPUs or special hardware at once.

This gives us two ways of allocating resources for a task. You can either run it locally, on your own general purpose hardware, or in the cloud, where a larger amount of computing power is available for a price. This is similar to computing in the past, where work was done on a central mainframe and dumb terminals were used to interface with it.

Instead of spending days or months to train a machine learning model against a dataset, you can rent out many large GPU instances in the cloud and train through terabytes of data much quicker. Software design has taken advantage of parallel computing, where a calculation can be spread out across many “worker nodes”, allowing the user to arrive at a conclusion faster. For example, Aran Komatsuzaki trained the ML model GPT-J over a period of five weeks, using a total of 1.5e22 FLOPs (floating point operations) on a Google Cloud TPU v3-256 (Google’s 3rd gen Tensor Processing Unit, 256 cores, 4TiB Memory).¹ ²

To get a good comparison of performance, I found some benchmarks for current processors to compare against.³ At the top of the site’s scoreboard was the Intel Core i9-11900K, which had a score of 851.2 GFLOPS. Using this number, I calculated how long it would take for the processor to complete the same training. This ignores many factors like memory, bandwidth and storage, so it’s not a great comparison but it gives us a better idea of the work required to train the model.

# Convert total training FLOPs to GFLOPs
1.5e22 / 1e9 = 1.5e13
# (training GFLOPs) / (CPU GFLOPS) = seconds the processor takes to train
1.5e13 / 851.2 = 1.76e10 seconds; ~29,137.20 weeks or 558.41 years
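The same arithmetic as a small script, in case you want to plug in a different benchmark score. It uses only the two figures quoted above; the variable names are mine, and the year conversion assumes a 365.25-day year. Like the hand calculation, it ignores memory, bandwidth, and storage.

```python
TRAINING_FLOP = 1.5e22   # total floating point operations quoted for training
CPU_GFLOPS = 851.2       # benchmark score quoted above, in GFLOPS

cpu_flops = CPU_GFLOPS * 1e9             # operations per second
seconds = TRAINING_FLOP / cpu_flops      # time for one CPU to do all the work
weeks = seconds / (7 * 24 * 3600)
years = seconds / (365.25 * 24 * 3600)   # assumes 365.25-day years

print(f"{seconds:.3g} s, {weeks:,.2f} weeks, {years:.2f} years")
```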

The fact that the model was able to be trained in only 5 weeks is a testament to the performance gained by parallelizing the task and using task-specific hardware. This sort of task would have been difficult, if not impossible, to perform on a local computer. Leveraging additional computing resources from cloud providers can be useful, but there is a large overhead cost you pay for that performance.

To bring cloud-level performance back down to the local computer, we need to insert an additional compute layer between the local CPU and servers in the cloud. Just like we used sound cards for generating sound and graphics cards to offload graphics-intensive work from the processor, we might start seeing additional add-in cards that assist the computer with certain types of computations. I figure these cards would fall into two categories: “general purpose” cards, whose job is to offload general purpose tasks from the primary processor, and “task-specific” cards, which are designed with a single, specific task in mind.

Expansion Card Foundations

These expansion cards would be similar to graphics cards, but would allow the OS to farm out tasks to free up space on the primary CPU. General purpose cards would be integrated at the kernel level, and assist the kernel in most if not all computing tasks such as virtualization, not just graphics-type work. They would provide additional resources to the computer such as processing, memory, and storage. Task-specific cards would have hardware geared toward a specific task like neural networks, machine learning, or individual programs.

The best place for this general purpose card to go is the PCI slot. It provides a standard, high-performance interface that our GP card can use to talk to the main CPU. It already has a defined API (PHI? Program-hardware interface?), and already has standardized hardware templates designers can use.

These, of course, already exist. One good example is network cards, which help manage network communication, allowing the CPU to focus on other things while the network card does what it does best: moving packets. Network cards are single purpose though; you can’t offload anything but networking onto them unless you code up some custom drivers.

As our desktop computers begin to hit a performance plateau, we will need to look for performance benefits in other areas. Separate hardware that can be added to an existing computer is one way to get them. It also helps make existing computers useful for longer and creates less e-waste in landfills. Users would no longer need to purchase an entirely new computer when they need to upgrade, only an add-in card. The card’s modular package and concentrated performance allow a user to easily improve the performance of their system without having to put a lot of time and effort into it.

General Purpose Cards

A general purpose add-in card should act as an additional processor for the computer, allowing it to offload nearly anything to free up CPU performance for other tasks.

An example from the past could be the CDC-6600. It was a mainframe released in 1964, and is considered to be the first successful supercomputer. It had an interesting hardware design which is similar to what we are positing here, where it had a single primary processor and 10 other “Peripheral processors” that work could be farmed out to. The CPU had a simplified instruction set, sort of like the forefather of RISC architecture. For more complex tasks, like memory access and I/O, it had dedicated peripheral processors that the primary processor would farm work out to. This allowed everything to operate in parallel and improved the throughput of the machine.

This paradigm of having a single primary and n peripheral processors in the CDC-6600 is exactly what we want to achieve with the add-in cards. The additional compute power could be tapped into, allowing the primary processor to focus on other tasks while the add-in card covers less important things. The card would bring extra processing power to a system, and could also add system resources like RAM and disk.

This card could be integrated at different levels into the host hardware/OS. One option would have it present itself to the OS simply as additional hardware, just like adding another RAM stick or hard drive. It would be integrated at the kernel level, outside of user space. The host system OS would manage the hardware resources on the card. The card itself wouldn’t be making too many decisions; it would simply be more raw resources for the system. Cards like this already exist for single, specific components like a hard disk.⁴ I couldn’t find a dedicated RAM card, but I did find a PCI card that a user would mount RAM on and use as a hard disk.⁵ I figure a RAM-only card wouldn’t perform too well, since RAM depends on tight latencies with the processor, so RAM by itself may perform better as an additional layer of cache rather than memory. If the card ends up being a “system on a card”, with its own CPU, RAM, and disk, I could see it being more feasible since calculations would be taking place primarily on the card, rather than across the host hardware and the card.

Another option could be to have the card act as a distinct computer. This would be like sticking a Raspberry Pi on a PCI card. In this scenario, it would present itself to the host as a completely separate computer, with its own hardware and OS. Instead of accessing the hardware directly, the host computer would interface with the card’s OS and offload larger, packaged computing tasks to it. The host might be aware of card statistics, but it wouldn’t be able to directly control the card’s resources. The card OS would manage the incoming work and decide how best to apply the card’s resources to each task. This way the host wouldn’t have to worry about the operation of the card itself, just the tasks it’s sending to the card.
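To make that split of responsibilities a bit more concrete, here is a toy model of it. Everything in it is hypothetical: the `AddInCard` class and its methods are invented for illustration and don't correspond to any real driver or PCI interface. The point is only the shape of the boundary, where the host can submit packaged tasks and read statistics but has no handle on how the card schedules its own resources.

```python
import queue
from typing import Any, Callable


class AddInCard:
    """Hypothetical stand-in for a card that runs its own OS and scheduler."""

    def __init__(self) -> None:
        self._pending: queue.Queue = queue.Queue()
        self._completed = 0

    def submit(self, task: Callable[[], Any]) -> None:
        """Host hands over a packaged task; it has no say in how it is run."""
        self._pending.put(task)

    def run_pending(self) -> list[Any]:
        """Card-side scheduling: here, simply run tasks in arrival order."""
        results = []
        while not self._pending.empty():
            results.append(self._pending.get()())
            self._completed += 1
        return results

    def stats(self) -> dict[str, int]:
        """Read-only statistics the host is allowed to inspect."""
        return {"queued": self._pending.qsize(), "completed": self._completed}


card = AddInCard()
card.submit(lambda: sum(range(1_000)))
card.submit(lambda: max(3, 7, 5))
results = card.run_pending()
print(results, card.stats())
```

A real card would run `run_pending` on its own processor and hand results back asynchronously; the sketch runs it in the same process just to keep it short.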

One last option I could imagine is an add-in FPGA card. An FPGA (field-programmable gate array) allows a user to “program hardware”, where computer hardware is described using a language like Verilog and implemented in configurable logic blocks. Because it can be configured for specific tasks, an FPGA will usually perform a task quicker and more efficiently than a general purpose CPU. This is because the different algorithms and operations can be implemented using hardware instead of software. An FPGA add-in card could be dynamically programmed for certain tasks as the host CPU requires it. As the host needs to compute different things, it could re-program the FPGA on the fly and send data through it. This would have a much higher barrier to entry for a user to do themselves, so it would probably require pre-compiled task configurations that would be available to download and use. While not as easy to “slot in” to existing system architectures, it could provide a large, configurable performance boost to pre-designed tasks. This type of card already exists, but can be prohibitively expensive, especially the top-of-the-line cards.⁶

Specialized hardware

The cards provide a stable platform to build on. They add great performance at a cheaper cost (you don’t have to re-buy the entire computer, only the card).

This gives us the chance to make specialized hardware available to normal computers.

Example: https://spectrum.ieee.org/computing/hardware/the-future-of-deep-learning-is-photonic Matrix multiplications using photons instead of transistors. It’s very specialized hardware that you would never find integrated into a CPU or motherboard. Having it on an add-in card makes it easier to sell, since buyers only need an existing computer, not an entirely new setup.

¹ GPT-J model training blog post: https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/
² Google TPU info: https://cloud.google.com/tpu/docs/types-zones#europe
³ Intel core i9 9900k benchmark: https://gadgetversus.com/processor/intel-core-i9-9900ks-gflops-performance/
⁴ Normal, run-of-the-mill PCI-e hard drive: https://www.newegg.com/western-digital-1tb-black-an1500-nvme/p/N82E16820250159
⁵ Use RAM as a hard drive: https://www.newegg.com/gigabyte-gc-ramdisk-others/p/N82E16815168001
⁶ $10,119.38 FPGA card: https://www.mouser.com/ProductDetail/BittWare/XUPVV8-0001?qs=vmHwEFxEFR%252BvJu%2FaBspDgw%3D%3D

Molecular Generation and Long-Term Processing

2021-06-26T00:00:00-05:00

I recently came across a paper from 2012 about generating organic molecules: Source Archive

The researchers spent large amounts of time performing an “enumeration of organic molecules up to 17 atoms of C, N, O, S, and halogens, forming the chemical universe database GDB-17 containing 166.4 billion organic molecules.”

Despite producing more than 400GB of compressed data, the database only captures a small subset of all possible and impossible molecules. The enumeration used different tricks, such as starting from basic structures, using known molecules as templates, and swapping out molecules with those of similar electron valence.

This dataset was produced in 2012, and since then many papers have used this paper and its data in various studies. Additionally, the authors have continued to update the dataset as recently as 2019. While it represents a large number of possible organic molecules, it is in no way a comprehensive source. Even so, it is still useful today as a “filter” for generating additional molecules or as input data for machine learning models.

This was all nice to read about, but it made me think about what sort of long running, complex calculations we’ll make in the future. Eventually, we will reach a processor ceiling, where the speed of light and other laws of the universe limit the amount of computing power possible. We’ve already seen the start of this as we create finer and finer chip fabrication processes.

Because of this limit, there could be a point where calculations we wish to perform are so complex or thorough that they take a long time to run, even with top of the line future-computers.

Things like neural network training, modeling, and simulation can already take a long time. As things become more complex and start pushing up against limits, calculations could take years to complete, just like the computation of the meaning of life in The Hitchhiker’s Guide to the Galaxy.

Algorithms will be designed to take into account the long compute time. One easy method that we use now is parallelization. We are able to run workloads on many computers at once, spreading out the work.
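A minimal sketch of that idea: cut the input range into chunks, hand each chunk to a worker, and combine the partial answers. The prime-counting task and the `split` helper are placeholders I made up, and a thread pool on one machine stands in for the "many computers" (in CPython, threads won't actually speed up pure-Python math, so treat this as the shape of the approach rather than a benchmark).

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds: tuple[int, int]) -> int:
    """Count primes in [lo, hi) by trial division (deliberately naive)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split(lo: int, hi: int, parts: int) -> list[tuple[int, int]]:
    """Cut [lo, hi) into `parts` contiguous chunks."""
    step = (hi - lo + parts - 1) // parts
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


chunks = split(0, 10_000, parts=4)
with ThreadPoolExecutor(max_workers=4) as pool:
    # Each worker handles one chunk; the partial counts are summed at the end.
    total = sum(pool.map(count_primes, chunks))
print(total)
```

The chunks are independent, which is what makes the split easy. Workloads where one chunk depends on another need more coordination than this.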

Another method could be to design algorithms that generate an output over time, instead of starting, calculating, and producing a single answer at the end. By splitting the output into chunks emitted during the calculation, we can take advantage of the results sooner, even if they aren’t complete yet.
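In Python terms, this is the difference between a function that returns a finished list and a generator that yields each result as soon as it exists. A small sketch, using prime generation as a made-up stand-in for a calculation that never really "finishes":

```python
from itertools import islice
from typing import Iterator


def primes() -> Iterator[int]:
    """Yield primes one at a time instead of returning a finished list."""
    found: list[int] = []
    n = 2
    while True:
        # n is prime if no previously found prime up to sqrt(n) divides it.
        if all(n % p for p in found if p * p <= n):
            found.append(n)
            yield n  # the consumer can use this result immediately
        n += 1


# Take results as they arrive; the calculation itself has no end.
first_ten = list(islice(primes(), 10))
print(first_ten)
```

The consumer decides how much output it needs and when to stop, so partial results are usable long before the full calculation would have completed.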

A similar method would be to allow input data to be updated. As this long running calculation takes place, other computers may start calculating the same thing, and their results could be useful to assist. If a computer is generating organic molecules like above, making the running calculation aware of externally calculated results means it has less work to do, which speeds up the calculation.
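One way this could look, sketched under heavy assumptions: the calculation consults a shared store of known results before doing any work, and that store can be added to from outside while the loop is running. The `run` function, the `shared` dictionary, and the squaring "calculation" are all placeholders of mine; a real system would need some agreed way to sync the store between machines.

```python
from typing import Callable, Iterable


def run(candidates: Iterable[int],
        expensive: Callable[[int], int],
        known: dict[int, int]) -> tuple[dict[int, int], int]:
    """Compute expensive(c) for every candidate, reusing entries in `known`.

    `known` is checked at every step, so results merged in from elsewhere
    while the loop is running are picked up. Returns the results and the
    number of evaluations actually performed locally.
    """
    results: dict[int, int] = {}
    computed = 0
    for c in candidates:
        if c not in known:
            known[c] = expensive(c)
            computed += 1
        results[c] = known[c]
    return results, computed


def square(n: int) -> int:
    """Stand-in for a costly calculation."""
    return n * n


shared = {2: 4, 3: 9}  # pretend another machine already produced these
results, computed = run(range(5), square, shared)
print(results, computed)
```

Here two of the five results come from the external store, so only three are computed locally.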

The greatest benefit may be to start the calculations as soon as possible. If we can get them started now, then they will finish sooner. Even if it isn’t the most optimized algorithm or hardware, it will at least make progress that could be recycled into the next attempt to improve its performance.

As algorithms and systems are improved, the data itself will also undergo transformations. As we’ve seen in the past, trying to make a single, perfect format is impossible. Every format has benefits and tradeoffs, such as readability, storage format, or compression. It’s better to choose a format that works best for the time being, just to get you started. I think it is better to focus on creating good tools to transform/translate/convert data from a legacy format to the newest and best format. That could be a format in and of itself. If it was, I think it would be important to have it include a log of the previous transformations it went through, and the additional data added as a result. This would act the same as identifying primary, secondary, and tertiary sources in historical research. There’s the original data of course, but often we can take that and draw further conclusions based on it. It would be important to include this metadata, but also to properly record how it was generated.
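As a rough illustration of data that carries its own transformation log, here is a small sketch. The `Tracked` class and its field names are hypothetical, not an existing format; the idea is just that every conversion returns a new record with one more history entry, so the chain back to the original stays readable.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Tracked:
    """Data plus a log of every transformation applied to it (hypothetical)."""
    data: Any
    history: list[str] = field(default_factory=list)

    def transform(self, name: str, fn: Callable[[Any], Any]) -> "Tracked":
        """Return a new record; the original is left untouched."""
        return Tracked(fn(self.data), self.history + [name])


raw = Tracked("C,N,O", history=["original: comma-separated text"])
as_list = raw.transform("split on commas", lambda s: s.split(","))
as_lower = as_list.transform("lowercase", lambda xs: [x.lower() for x in xs])

print(as_lower.data)
print(as_lower.history)
```

Because each step leaves the earlier record intact, `raw` still plays the role of the primary source while `as_lower` is clearly marked as derived from it.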

With a proper way to manage data, it will make it easier to integrate it into calculations we wish to make, and set us up for better performance in the future. That just leaves the question: What calculations should we be beginning now?