Groups and Rust Macros

This post is a continuation of this previous post, taking a bit of a different approach to the problem. The limits on the expressivity of the type system and the requirements for safety imposed by the compiler proved too much for my knowledge of Rust last time. One discarded approach from a prior try was to define a tuple and iterate over it, applying the operations at each step. This did not in fact scale to an arbitrary number of product elements - each length would have to be defined in source. As far as I could tell, there was no overarching simplification or abstraction that would fix this problem within the constraints I have imposed on myself.

Earlier today, while thinking about my team’s practicum project and the associated project planning tasks, I was hit with a wave of inspiration. I had some ideas on how to use the macro system in Rust to implement the desired behavior for these types while keeping it general to direct product groups of any size.

The idea is this: if Rust can define tuple structs and derive macros (#[derive(Debug)], e.g.) - and if these work for the Default trait (they do) - then there should be a memory safe, reliable, and consistent way of deriving the group operations for my case. Here, instead of calling the ::default() function, we call .op() (with an argument), .inv() (for getting the inverse), and of course identity() (which gets the group identity), each doing much the same as default() does for that one specific trait.

Think:

#[derive(Default)]
struct MyType(Type1, Type2, Type3);

// MyType::default() = MyType(Type1::default(), Type2::default(), Type3::default())

but applied to the group types. It's gotta be possible!!
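To make the idea concrete, here is a sketch using a declarative macro rather than a proper derive macro. The Group trait shape, the Cyclic stand-in group, and every name below are my assumptions, not code from the project; I also use named fields because macro_rules! cannot easily index tuple fields.

```rust
trait Group {
    fn op(&self, other: &Self) -> Self;
    fn inv(&self) -> Self;
    fn identity() -> Self;
}

// Stand-in component group: integers mod N under addition.
#[derive(Debug, Clone, PartialEq)]
struct Cyclic<const N: u64>(u64);

impl<const N: u64> Group for Cyclic<N> {
    fn op(&self, other: &Self) -> Self { Cyclic((self.0 + other.0) % N) }
    fn inv(&self) -> Self { Cyclic((N - self.0 % N) % N) }
    fn identity() -> Self { Cyclic(0) }
}

// One rule covers a product of any number of component groups, applying
// each operation field-by-field, just like #[derive(Default)] does.
macro_rules! direct_product {
    ($name:ident { $($field:ident : $t:ty),+ $(,)? }) => {
        #[derive(Debug, Clone, PartialEq)]
        struct $name { $($field: $t),+ }

        impl Group for $name {
            fn op(&self, other: &Self) -> Self {
                $name { $($field: self.$field.op(&other.$field)),+ }
            }
            fn inv(&self) -> Self {
                $name { $($field: self.$field.inv()),+ }
            }
            fn identity() -> Self {
                $name { $($field: <$t as Group>::identity()),+ }
            }
        }
    };
}

direct_product!(C4xC6 { a: Cyclic<4>, b: Cyclic<6> });

fn main() {
    let x = C4xC6 { a: Cyclic(3), b: Cyclic(5) };
    // An element composed with its inverse gives the identity, componentwise.
    assert_eq!(x.op(&x.inv()), C4xC6::identity());
    println!("{:?}", x.op(&x.inv()));
}
```

The single macro rule scales to any number of components, which is exactly what the hand-written tuple impls could not do.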

Till next time

  • Jack

UnitaryCon and ROCm

This post is not a continuation of the last post - however, I do have an additional update to it. This sort of multiple dispatch across a defined interface (or Trait in Rust's case) might not be warranted here. Generally, a faithful Rust implementation of similar behavior would be in the form of an enum - with variants for each case (the overall type here being a group, and the enum variants being the various groups I want to define). This is the idea of "composition over inheritance" at work. The only issue is the extra information and constraints that the inner enum variants give on type operations. Because Rust's type system is so expressive, I was relying upon the compiler to throw errors at compile time if incompatible types (different groups) had operations defined against each other.
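A minimal sketch of that enum shape (the names and the single cyclic-group variant are my invention, standing in for the real set of groups) shows where the check moves: mismatches can only be caught at run time.

```rust
// Sketch of the enum ("composition over inheritance") shape described above.
#[derive(Debug, Clone, PartialEq)]
enum GroupElem {
    Cyclic { order: u64, value: u64 },
    // ... other groups as further variants
}

impl GroupElem {
    fn op(&self, other: &Self) -> Option<Self> {
        match (self, other) {
            (GroupElem::Cyclic { order: n, value: a },
             GroupElem::Cyclic { order: m, value: b }) if n == m => {
                Some(GroupElem::Cyclic { order: *n, value: (a + b) % n })
            }
            // Different groups (or different orders): the operation is
            // undefined, and the error surfaces at run time, not compile time.
            _ => None,
        }
    }
}

fn main() {
    let a = GroupElem::Cyclic { order: 4, value: 3 };
    let b = GroupElem::Cyclic { order: 4, value: 2 };
    let c = GroupElem::Cyclic { order: 5, value: 1 };
    assert_eq!(a.op(&b), Some(GroupElem::Cyclic { order: 4, value: 1 }));
    assert_eq!(a.op(&c), None); // mismatch only detected at run time
    println!("{:?}", a.op(&b));
}
```

The Option-returning op is the price of the enum approach: the trait-based version rejects `a.op(&c)` before the program ever runs.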

I got pretty far with the last implementation - with nice working groups and compile-time checked operations on those groups. The issue that stopped me for now was trying to define a direct product - which would also implement the group type, with all operations on it type checked at compile time (hard).

For summer updates, I will have to keep it brief - contact me directly if you want to know more. I worked at an interesting embedded software position and had a really successful project outcome. Loved the folks I met and didn't have to move again (how great!).

In other updates, I recently attended the open-source quantum computing software ecosystem conference run by the Unitary Fund. It was a fantastic conference in Helsinki; I met lots of awesome people and partook in the Finnish fondness for saunas. I do miss how well the suspensions on their public transit rail lines kept the car from reacting to bumps - I nearly fell asleep, it was so quiet and smooth.

Been playing a bit with ROCm recently and looking at the gaps that exist in its coexistence and fight for relevance against Nvidia's CUDA. I am optimistic - I am very much for this sort of software and the competition it encourages in this part of the tech landscape. If I had additional time, I would try to become part of the folks working on improving this every day - Nvidia's moat here may not be as deep as the markets seem to think. I have also been playing with some of the Rust large language model stuff that's out there now (see kalosm and floneum).

Till next time

  • Jack

Modeling Algebraic Structures

This is a bit of a continuation of the last post here

Cyclic groups have been implemented since the last post, and both groups now implement a generic Group trait. This comes with its own unique benefits and drawbacks. The benefit is that the set of operations every Group must implement can be defined and centralized in a single spot in source - thus providing the definition of the interface. The downside is that, with the way I currently have the Group trait implemented - generic, with some methods returning Self - I am now fighting my way through the limitations of Rust's dynamic dispatch. Specifically, fighting my way through this problem:

error[E0038]: the trait `Group` cannot be made into an object
  --> groups/src/lib.rs:34:25
   |
34 |     components: Vec<Box<dyn Group>>,
   |                         ^^^^^^^^^ `Group` cannot be made into an object
   |
note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
  --> groups/src/lib.rs:26:25
   |
25 | pub trait Group {
   |           ----- this trait cannot be made into an object...
26 |     fn op(&self, other: &Self) -> Self;
   |                         ^^^^^ ...because method `op` references the `Self` type in this parameter
27 |
28 |     fn inv(&self) -> Self;
   |                      ^^^^ ...because method `inv` references the `Self` type in its return type
29 |
30 |     fn identity() -> Self;
   |        ^^^^^^^^ ...because associated function `identity` has no `self` parameter
   = help: consider moving `op` to another trait
   = help: consider moving `inv` to another trait
help: consider turning `identity` into a method by giving it a `&self` argument
   |
30 |     fn identity(&self) -> Self;
   |                 +++++
help: alternatively, consider constraining `identity` so it does not apply to trait objects
   |
30 |     fn identity() -> Self where Self: Sized;
   |                           +++++++++++++++++

For more information about this error, try `rustc --explain E0038`.

The useful reading is of course in the docs, and the output of rustc --explain E0038 is also helpful.

This is unfortunate, however, and I am worried that I won't be able to have the interfaces and data structures I desire to model the next part - direct products of groups. I always imagined them as tuples where each component of the tuple is an element of that component's group. For the groups S_4 and C_4, denoting the permutation group on 4 elements and the cyclic group of order 4 respectively, a ∈ S_4 x C_4 where a = (a_1, a_2) s.t. a_1 ∈ S_4 and a_2 ∈ C_4. Ideally, the source code, data structures, and interface would closely resemble this, and yet I have this problem. Guess I have some reading to do on those pages - and maybe on generic associated types (GATs)?
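One escape hatch I may explore: skip trait objects entirely and keep the product generic over its component types, so the compiler still checks everything statically. The trait below matches the shape from the error output above; the Cyclic and DirectProduct types are hypothetical sketches of mine, not the project's code.

```rust
pub trait Group {
    fn op(&self, other: &Self) -> Self;
    fn inv(&self) -> Self;
    fn identity() -> Self;
}

// A stand-in cyclic group of order N (addition mod N).
#[derive(Debug, Clone, PartialEq)]
struct Cyclic<const N: u64>(u64);

impl<const N: u64> Group for Cyclic<N> {
    fn op(&self, other: &Self) -> Self { Cyclic((self.0 + other.0) % N) }
    fn inv(&self) -> Self { Cyclic((N - self.0 % N) % N) }
    fn identity() -> Self { Cyclic(0) }
}

// The direct product as a generic pair instead of Vec<Box<dyn Group>>:
// no vtable is needed, and mixing groups is a compile-time type error.
#[derive(Debug, Clone, PartialEq)]
struct DirectProduct<A: Group, B: Group>(A, B);

impl<A: Group, B: Group> Group for DirectProduct<A, B> {
    fn op(&self, other: &Self) -> Self {
        DirectProduct(self.0.op(&other.0), self.1.op(&other.1))
    }
    fn inv(&self) -> Self {
        DirectProduct(self.0.inv(), self.1.inv())
    }
    fn identity() -> Self {
        DirectProduct(A::identity(), B::identity())
    }
}

fn main() {
    let a = DirectProduct(Cyclic::<4>(3), Cyclic::<6>(5));
    assert_eq!(a.op(&a.inv()), DirectProduct::identity());
    println!("{:?}", a.op(&a.inv()));
}
```

Longer products would nest, e.g. DirectProduct<A, DirectProduct<B, C>> - which is exactly the "each length has to be defined in source" pain that a macro or more type machinery would need to address.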

Putting this on pause for a little bit to do some homelab-ing/life stuff/organization.

  • Jack

Permutations in Rust Code

This is a bit of a continuation of the last post here

I have implemented a very simple bit of permutation group code. The design of this showcases why I think algebraic type systems are so powerful. Simply put, only operations between permutations that act on the same number of objects make any sense. Mathematically this is usually not a problem when the permutations are of different lengths - it's always easy to extend the smaller of the two with an identity map on the additional elements and then proceed - but leveraging Rust's type system to ensure that operations accept operands of the same group is a powerful thing.

I decided that the representation of a permutation should be an array. Each index of the array contains what that element maps to. If I wanted to represent a permutation in which 2 items swap, in cycle notation it would be written like so: (1 2).

Looking at the internal array for this permutation, it looks a bit strange: [2, 1]. There is a bit of tension here between standard mathematical notation and computer programming, although it is unimportant. In standard permutation notation, 1 is the first element. Thus this array is saying that 1 maps to 2, and 2 maps to 1. Of course in code, the indices are off by one. Why does an internal implementation conform to such arbitrary standards? Mainly because of my comfort with existing notation.

Here are some key snippets from the code

#[derive(Debug, Clone)]
struct Permutation<const SIZE: usize> {
    map: [usize; SIZE],
}

and this one

impl<const SIZE: usize> Permutation<SIZE> {
    // Convert a 1-based element label to its 0-based array index.
    fn index_from_elem(elem: usize) -> usize {
        elem - 1
    }

    fn compose(&self, other: &Permutation<SIZE>) -> Self {
        let mut map_copy = self.map;
        for index in 0..SIZE {
            // Apply self first, then other: result(x) = other(self(x))
            map_copy[index] = other.map[Self::index_from_elem(self.map[index])];
        }
        Self { map: map_copy }
    }
}

pretty great! This allows chaining compositions like so:

let s4_1 = Permutation::<4>::random();
let s4_2 = Permutation::<4>::random();
// e • (s4_2 • (s4_2 • s4_1)) = ??
dbg!(&s4_1
    .compose(&s4_2)
    .compose(&s4_2)
    .compose(&Permutation::<4>::new()));

Thus if s4_1 = (1)(2 4 3), s4_2 = (1 3 4)(2), and e = (1)(2)(3)(4) per usual…

e • (s4_2 • (s4_2 • s4_1)) = (1)(2)(3)(4) • (1 3 4)(2) • (1 3 4)(2) • (1)(2 4 3) = (1 4)(2 3)

or as output:

[src/main.rs:103] &s4_1.compose(&s4_2).compose(&s4_2).compose(&Permutation::<4>::new()) = Permutation {
    map: [
        4,
        3,
        2,
        1,
    ],
}

Excellent!
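For completeness, here is a self-contained sketch of the pieces the snippets above assume - new() for the identity permutation, plus an inverse. The inv method is my own addition, not from the original code.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Permutation<const SIZE: usize> {
    map: [usize; SIZE],
}

impl<const SIZE: usize> Permutation<SIZE> {
    // 1-based element label -> 0-based array index.
    fn index_from_elem(elem: usize) -> usize {
        elem - 1
    }

    // The identity permutation: every element maps to itself.
    fn new() -> Self {
        let mut map = [0usize; SIZE];
        for i in 0..SIZE {
            map[i] = i + 1;
        }
        Self { map }
    }

    // If element i maps to m, the inverse maps m back to i.
    fn inv(&self) -> Self {
        let mut map = [0usize; SIZE];
        for i in 0..SIZE {
            map[Self::index_from_elem(self.map[i])] = i + 1;
        }
        Self { map }
    }

    // Apply self first, then other: result(x) = other(self(x)).
    fn compose(&self, other: &Permutation<SIZE>) -> Self {
        let mut map_copy = self.map;
        for index in 0..SIZE {
            map_copy[index] = other.map[Self::index_from_elem(self.map[index])];
        }
        Self { map: map_copy }
    }
}

fn main() {
    // (2 4 3) in cycle notation, as the internal 1-based array.
    let p = Permutation::<4> { map: [1, 4, 2, 3] };
    // A permutation composed with its inverse is the identity.
    assert_eq!(p.compose(&p.inv()), Permutation::<4>::new());
    println!("{:?}", p.inv());
}
```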

Next, implementing the cyclic groups

Rubik's Cube pt 2

After implementing a small (and painful) visualization for the bit of code I have been working on to model puzzle cubes, I came to a realization about representation. I was hoping that a more elegant way of describing movements of the cube would lead to a more elegant way of programming a solution. I read over the majority of the document mentioned in the previous post and pondered it a bit. I wondered how minimal my representation (modeling each visible face of each cubie affected by rotations - not centers) was compared to a full mathematical description. The document was very interesting: its detailing of the groups that describe each of the 4 components of the cube (position & orientation for edges, position & orientation for corners), its review of group actions, basic information on permutations, and a little on orbits was great. In fact, the biggest takeaway from all of this, given towards the end, was that for the group action G that acts on the set describing the cube's configuration: "The orbit of the start configuration under this action is exactly the set of valid configurations of the Rubik's cube."

This is great, because it helped me conceptualize a bit better what an orbit can mean, as well as relate what I see in reality with cubes to the mathematical representation discussed there. What it means for a configuration to be 'valid' is somewhat ignored until this very last section. It might surprise some folks, but it is not possible to solve a Rubik's cube with only valid moves if a single corner or edge is flipped. This is because not all possible configurations of edge and corner piece positions and orientations are reachable from the start configuration (though a flip can sometimes happen through other means).

Regardless, while contemplating how permutation cycle notation could be used as part of the software representation instead of the crazy repetitive setup I have going on right now, I realized that it's not hard at all. For each piece, I apply the permutation the cycle notation defines to know where it should be in the end state. This is perfectly convenient, as computing locations for each cubie could be done just via the stored permutation from a known state. This stored permutation can be updated by successive moves, of course, by simply composing the permutation from the group of moves (group action G) with the current permutation and storing the result. These computations may be a bit easier to reason about than the crazy indexing I have going on right now.

One thing of note, though, is that the crazy setup I have going on right now is still fairly minimal. The permutation cycle notation will still need to encode the orientations and positions for the edge and corner cubies. This means 12 edge cubies * 2 orientations + 8 corner cubies * 3 orientations = 48 things to track. However, my current implementation is already using a 48 element array to track the state of the cube. The most minimal representation I can currently think of would be the number of bits required to store an element of S8 x S12 x C3^8 x C2^12: ~519 quintillion elements ≈ 69 bits of information. It turns out the number of valid configurations (ones in the orbit of legal moves on the cube from the solved state) is exactly 1/12 of the number of total configurations (this is shown in Theorem 11.1 and stated in Remark 11.15 of that document). Thus, maybe there is some way to reduce the representation down to something on the order of ~66 bits, however I do not know how I would do so at the moment.
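As a sanity check on those numbers, the counts above can be computed directly:

```rust
fn main() {
    let fact = |n: u128| (1..=n).product::<u128>();
    // |S8| * |S12| * 3^8 * 2^12: positions and orientations of corners/edges.
    let total: u128 = fact(8) * fact(12) * 3u128.pow(8) * 2u128.pow(12);
    // Exactly 1/12 of these are reachable from the solved state.
    let valid = total / 12;
    println!("total: {total} (~{:.1} bits)", (total as f64).log2());
    println!("valid: {valid} (~{:.1} bits)", (valid as f64).log2());
}
```

The valid count comes out to the well-known 43,252,003,274,489,856,000 configurations of the cube.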

Next on the chopping block, implementing permutation arithmetic operations to calculate state with this new representation!

Rubik's Cube

I am working on some software for working with digital Rubik's cubes in Rust. I think it would be an interesting challenge to write a proper, well-tested simulation of the cube and cube movements. I am currently working on the back-end representation of a 3x3 cube, for which I will later create an interface for an interactive mode.

My first pass models the whole cube via what is on each of its 6 faces, but without the center cubie color as each move would leave it unchanged (but possibly rotated). The faces are labeled 0 to 5 (Bottom, Front, Right, Back, Left, Upper) and the colors are 1 through 6 (White, Blue, Red, Green, Orange, Yellow) respectively. The starting orientation of the cube puts the White face as the bottom face, and the Blue face as the front face (thus the Red face is on the right face).

This is a usable model; however, I am aware of the algebraic representation of the Rubik's cube, modeling it and its components with group actions. The downside of my model is that much of my code is repetitive and very dense, and thus difficult to understand. A transformation associated with a single face rotating means describing how each face of the affected cubies relates to the previous state, which means lots of array accesses. Everything is highly index sensitive - very prone to programmer error.

Fortunately, with sufficient testing it is possible for me to convince myself that I've done it correctly. I tested that certain properties of moves that should hold on a real cube also hold in my model.
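One such property: a quarter-turn permutes two rings of stickers in 4-cycles, so applying the same move four times must return the cube to where it started. A toy sketch of that check (the permutation below is made up for illustration, not my cube's actual move table):

```rust
// Apply a permutation of sticker indices: sticker i moves to perm[i].
fn apply(perm: &[usize; 8], state: [usize; 8]) -> [usize; 8] {
    let mut next = [0usize; 8];
    for i in 0..8 {
        next[perm[i]] = state[i];
    }
    next
}

fn main() {
    // Two 4-cycles, like the corner and edge stickers of one face:
    // (0 2 7 5) and (1 4 6 3).
    let quarter_turn = [2, 4, 7, 1, 6, 0, 3, 5];
    let start = [0, 1, 2, 3, 4, 5, 6, 7];

    let mut state = start;
    for _ in 0..4 {
        state = apply(&quarter_turn, state);
    }
    assert_eq!(state, start); // four quarter-turns = identity
    assert_ne!(apply(&quarter_turn, start), start); // but one is not
    println!("four quarter-turns return to start");
}
```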

As I continue to work on this, the next steps are:

  1. Making a decent tui visualizer of the current state (probably just going to print out a few grids with color options in Rust)
  2. Making some sort of interactive mode taking in user input and spitting out the next cube state.
  3. Finish reading this – https://people.math.harvard.edu/~jjchen/docs/Group%20Theory%20and%20the%20Rubik%27s%20Cube.pdf – with which to rewrite the internal representation.
  4. Track orientation of middle-of-face cubies (for those image puzzle cubes)
  5. Consider how this can generalize to larger cubes

And then other considerations, in no particular order:

  • Some sort of solver, maybe using some sort of Meet-In-The-Middle technique
  • Some sort of algorithm explorer
  • Better visualization for cube - perhaps even 3-D?

Machine learning with gzip

I am not sure in which context I originally stumbled across the concept presented in this paper. In it, the authors present a technique that uses a lossless compression function (gzip) + a compression-based distance metric (Normalized Compression Distance) + k-nearest neighbors (k-nn) for text topic classification. I liked this paper when I first learned of it because it is a parameter free model (hold the k hyperparameter), which is against the norm for other popular models in the space. I am no expert on NLP (although I have worked in some other areas of machine learning), but something I can certainly appreciate in the era of multi-billion parameter language transformers is a simple idea applying existing tools in an effective manner.

One thing of note for this technique, however, is its runtime complexity. Computing gzip means performing the DEFLATE algorithm, a two step process of LZ77 and then Huffman coding. A number of places on the internet said that the runtime complexity of this was \( O(n) \), where n is the size of the uncompressed data. I could not find any credible sources doing out the analysis, and when I started digging I found very few answers (some more information can be found at these two wikipedia articles: Huffman coding and LZ77).

Instead I opted for a more empirical approach by just measuring gzip’s performance on large bodies of data. Firstly, I generated large files of random data using the following command:

for arg in 1K 5K 10K 50K 100K 250K 500K 1M 5M 10M 25M 50M 250M 500M 1G; do     head -c $arg </dev/urandom >"$arg.rand"; done

and then getting gzip timings by running:

for arg in 1K 5K 10K 50K 100K 250K 500K 1M 5M 10M 25M 50M 250M 500M 1G; do     time gzip "$arg.rand"; done

This gave me some data, which I have saved here, and I ran a regression against it. It looks like a linear regression is sufficient; anything lower order (I tried \( \log n \), for example) did not work well. So for now, gzip empirically has linear runtime complexity (tested up to 1 gigabyte). The x-axis represents file size before compression (on a logarithmic scale), and the y-axis is seconds to compress (user + sys from the time command).

The other components in this algorithm are interesting too. For example, the authors propose Normalized Compression Distance (NCD) as a means to compute the distance as used by k-nn. This metric is not complicated to compute, the formula for which is given in the paper, where \( C(x) \) is the compressed length of \(x\) and \(xy\) denotes the concatenation of \(x\) and \(y\).

\[ NCD(x, y) = \frac{C(xy) - \min \left\{C(x), C(y) \right\}}{\max \left\{C(x), C(y)\right\}} \]

Computing the Normalized Compression Distance between two texts \(x\) and \(y\) will require computing the compressed length of \(xy\) as well.
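As a rough sketch of the formula in code: Rust's standard library has no gzip, so this shells out to the system gzip binary for \( C(\cdot) \) (a crate like flate2 would be the more idiomatic choice in a real project).

```rust
use std::io::Write;
use std::process::{Command, Stdio};

// Compressed length C(x), by piping the bytes through the system gzip binary.
fn c(data: &[u8]) -> usize {
    let mut child = Command::new("gzip")
        .arg("-c")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .expect("system gzip not found");
    child.stdin.take().unwrap().write_all(data).unwrap();
    child.wait_with_output().unwrap().stdout.len()
}

// NCD(x, y) = (C(xy) - min{C(x), C(y)}) / max{C(x), C(y)}
fn ncd(x: &[u8], y: &[u8]) -> f64 {
    let (cx, cy) = (c(x), c(y));
    let cxy = c(&[x, y].concat());
    (cxy as f64 - cx.min(cy) as f64) / cx.max(cy) as f64
}

fn main() {
    let a = b"the quick brown fox jumps over the lazy dog ".repeat(20);
    let b = b"pack my box with five dozen liquor jugs ".repeat(20);
    // Identical texts compress well together, so their distance is small.
    println!("NCD(a, a) = {:.3}", ncd(&a, &a));
    println!("NCD(a, b) = {:.3}", ncd(&a, &b));
}
```

For k-nn classification, this ncd would be the distance function between a query text and each labeled training text.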

And of course, an implementation of k-nn brings its own runtime complexity as well. This I am choosing not to derive here out of respect for my time and the brevity of this article.

A few notes here at the end on this paper. The authors use Normalized Compression Distance as a stand-in for information distance (\(E(\cdot)\)), which is uncomputable because of its dependence on Kolmogorov complexity. The idea is that as gzip's compression gets closer to optimal, \(C(\cdot)\) approaches \(K(\cdot)\), and thus \(NCD\) approaches \(E\).

The next note I had here is a video on optimality related to Kolmogorov complexity (specifically on the algorithm proposed in the uncomputability section of that wikipedia article): "The most powerful (and useless) algorithm" - polylog, and its precursor: "The OPTIMAL algorithm for factoring!" - polylog.

Upcoming week of work

Have a good amount of large projects for my coursework coming along, thus will unfortunately be pretty busy. I am looking forward to heading to my mom’s house for Thanksgiving this year. Lillian and I will probably get a real tree from the tree grove which we will decorate once we get back. Current ongoing projects for my coursework are the advanced formal methods project on LTSA, the quality assurance project on the card game ‘Love Letter’, and wrapping up some personal projects.

For some reason, ligatures are not supported in Emacs for the new font I was discussing in the last post, JuliaMono. I see that there are a few packages and setups that do have ligatures working (it seems like nothing is out of the box for this feature), but none of them look super simple/appealing to me.

I enjoyed an older movie last weekend that I recommend to anyone who enjoys a good comedy and a progressive tone: Auntie Mame.

I am almost to the point where I need to clean up and back up my old PC. It's being converted into a Steam Big Picture mode-esque machine, and I will need to move all of my files off of it onto another machine. Hopefully, with a bit of effort, I can collect all of these documents/files together, organize them, and properly back them up (including to the cloud).

  • Jack

Small and upcoming things

The first version of this blog post was lost, which is sad because I am normally so good about having things autosaved. Anyways, the past two posts have been on the larger side, and I tend to find that smaller posts suit my style, my free time, and my writing motivation. I have a few posts coming down the pipeline, but they are a little big, and I have some things that I want to share now. For example, check out this cool font that I have been trying out for the past few days:

JuliaMono

I think it is easy to read and a bit more playful than my standard monospaced font for nearly everything in my life: RobotoMono Nerd Font Mono. Currently I am using this font in my emacs configuration as well as in my terminal emulator (kitty).

  • Jack

Rust Learning and Projects

During the COVID-19 pandemic, I made it a habit to add something to this blog every couple of days. I was busy then (surprise!), and I am busy now. What changed? Priorities. The phrase "you have to make time for it" was something that didn't sit well with me for a number of years. When I first heard it, it was probably in reference to some responsibilities I had shirked, thus the negative association. I have since come around to truly appreciate the phrase for what it is and use it myself. If I don't prioritize something, it will not happen. I will fill my time with other things first, and whatever task it is will either be forgotten or discarded. There are a number of things I am not willing to discard, including this webpage. Eventually, using this page and others, I will organize and collate the important items in my life digitally. This will probably take a long time (probably a lifetime). For now, here are a few projects of mine from the past few years that I would like to showcase, particularly those using the Rust programming language.

Here is my fork of the rustlings project with my solutions as patches applied on the trunk. I added a small bit of CI for Gitlab to ensure that my solutions complete the exercises correctly. I periodically update this by pulling the upstream changes into a staging branch, rebasing my changes onto that branch (fixing any conflicts that appear), making small tweaks to exercises that have been updated, getting CI to pass, then merging into the main branch. This means that I have quite a collection of my own solutions to a number of exercises that test knowledge on the basic components of the Rust programming language.

This project is simply a TCP server that takes messages sent over a socket to the server and resends them to all connected sockets/clients. It is a fun way to spin up a really small chat room really quickly (assuming that you can telnet over your local network and type in all the right IP addresses :-)). I was going to turn this project into something with a small tui (text user interface), but UI has never really been my game.

This project was my final project for the intro to operating systems course at the University of Massachusetts Amherst. This undergrad course allowed a project type of your choosing - as long as it was an individual project. Most of the course was done in C and C++. I wanted to do a Rust rewrite of one of the projects, and after some debate about which project could remain most faithful to the original implementation, I decided the filesystems project was a good start. The implemented file system is extraordinarily limited, and a file is used as a proxy for an actual block device, but it does have some of the basic components of filesystems implemented (free block list, inodes, block pointers). In its current state, it does not implement many essential features of modern filesystems such as directories, permissions, superblock(s), indirect blocks, journaling, etc. Finally, this project contains multiple layers: extensive documentation, unit/system testing, github ci/cd, and a few other tools. This implementation also attempts to remain at feature parity with the project itself. This caused me to learn about the #[repr(C)] attribute in Rust, which forces the underlying representation to match that of a C struct. Initially, I got the filesystem to work, but because the Rust representation of data and the C representation of data were not identical, the data written to the proxy block device differed between the two implementations.
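The effect of #[repr(C)] can be seen on a hypothetical inode-like struct (this layout is illustrative, not the project's actual one): with the attribute, field order and padding follow C's rules, so the on-disk bytes line up with the C implementation.

```rust
// Illustrative on-disk inode record; #[repr(C)] pins the layout to C's rules
// (declaration order, C padding) instead of letting rustc reorder fields.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct Inode {
    size: u32,                // file size in bytes
    direct_blocks: [u32; 12], // direct block pointers
    used: u8,                 // allocation flag
    _pad: [u8; 3],            // explicit padding to match the C struct
}

fn main() {
    // 4 + 48 + 1 + 3 = 56 bytes, same as the equivalent C struct.
    println!("{}", std::mem::size_of::<Inode>());
}
```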

I have a few other small projects written in Rust that aren't worth sharing here - either because most of the code was derived from someone/somewhere else or because the project is so incomplete that it does not work/isn't worth sharing. A good example of this is the two-part project that initially motivated me to learn the Rust programming language: a C library of datastructures as presented in CS187, and a C library of algorithms as presented in CS311. These two projects are related to each other in that I wanted to use the datastructures library in the implementation of the algorithms library. I chose to write both in C for a few reasons: practice in C, manually managing memory, difficulty, and performance. These two projects never came to full fruition, however the datastructures project did make some strides (see libds).

I lost some interest in the project when I started dealing with the dependency injection required for custom types. Say you're making a binary tree out of char[] types - how do you determine the strings' ordinality? The answer is you can't for every type, at least not within the library. A way to do this is to let the user specify their own comparison function and inject that function to be used inside the implementation of the binary tree. This uses function pointers and "void *" types, and feels a little like bending C to be more object oriented - all things that were displeasing to me even though they were (maybe) necessary evils. The answer for me here... a language with a more comprehensive type system while still remaining low-level - thus Rust. Essentially, I wanted the power of generics (like C++ templates, really) without the horrors of C++. Eventually I will take those projects, resurrect them from the land of C, and put them in Rust so that I may work past the barriers I ran into long ago, but for now I have much more pressing tasks to attend to.
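For contrast, the same binary-tree insert in Rust needs no injected comparator at all - a T: Ord bound replaces the C function pointer. This is a sketch of mine, not code from the actual libds project:

```rust
// A binary search tree node; `T: Ord` takes the place of the C-style
// `int (*cmp)(const void *, const void *)` comparator injection.
struct Node<T: Ord> {
    value: T,
    left: Option<Box<Node<T>>>,
    right: Option<Box<Node<T>>>,
}

impl<T: Ord> Node<T> {
    fn insert(&mut self, value: T) {
        let child = if value < self.value {
            &mut self.left
        } else {
            &mut self.right
        };
        match child {
            Some(node) => node.insert(value),
            None => *child = Some(Box::new(Node { value, left: None, right: None })),
        }
    }

    // In-order traversal yields the values in sorted order.
    fn in_order(&self, out: &mut Vec<T>)
    where
        T: Copy,
    {
        if let Some(l) = &self.left {
            l.in_order(out);
        }
        out.push(self.value);
        if let Some(r) = &self.right {
            r.in_order(out);
        }
    }
}

fn main() {
    let mut root = Node { value: "fig", left: None, right: None };
    for s in ["cherry", "apple", "grape"] {
        root.insert(s);
    }
    let mut sorted = Vec::new();
    root.in_order(&mut sorted);
    println!("{:?}", sorted);
}
```

The compiler checks at the call site that the element type actually has an ordering, which is exactly what the void-pointer version of the C library could never promise.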

The final word on Rust projects I have tried includes some very low level work: trying to write something that makes a disk image that boots via BIOS and then hangs - a dumb kernel, if you will - to learn some of the necessary steps of booting an operating system. I had a good name for this, 'ferrOS', probably derived/inspired from ferrous-systems and the work that I've seen by folks like phil-opp and japaric. For some time I was following the blog by phil-opp here.

That’s the most complete list of Rust projects and information about them that I can gather for now.

  • Jack