Continuous Integration with Git, Jenkins, and Phabricator

I don’t normally write about my day job, but the subject of this post is something I put a lot of value in. Over the past year I was in charge of researching, deploying, and maintaining modern software development tools at my workplace. It’s been an enlightening experience, and it’s given me a firm understanding of the motivation behind these tools. Prior to this, the company had a draconian system for source control: senior engineers copied source over the network and used a visual merge tool to maintain a known good version of it. The difference after introducing these tools has been night and day, so I will spend today’s post discussing the benefits of continuous integration (CI) and how it enforces high-quality source.

I’ve mentioned Git in a blog post long ago so I won’t talk about what it is or how it works here. Instead, I will comment on how you can go wrong with Git and how to integrate CI with it. Regarding the former, the “Git Flow” model almost encourages bad practice. It entangles the commit tree by mandating separate master, develop, feature, and release branches connected via a stream of pointless merge commits. To preserve the clarity and utility of the Git log, there should ideally be only one master branch on origin. Developers must make sure that their local branches are continuously rebased on the latest origin branch. Only when a topic branch is complete should it be pushed to origin as a single commit, with a commit message that completely describes the feature being added and nothing else. This model ensures that the developer is continuously integrating the latest source into their development branch, which minimizes conflicts and bugs. It also ensures that each commit on the origin branch signifies a meaningful version of the source that someone would conceivably check out. The commit log then reads as a clear sequence of new features and bug fixes being introduced.
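For concreteness, here is a minimal sketch of that daily loop, driven from Python to keep it scriptable; the branch name and commit message are placeholders, and your team may well prefer an interactive rebase for the squash step:

```python
import subprocess

def git(*args):
    """Run a git command and fail loudly if it errors."""
    subprocess.run(["git", *args], check=True)

# Day to day: keep the local topic branch rebased on the latest origin/master.
git("fetch", "origin")
git("rebase", "origin/master", "my-feature")  # branch name is a placeholder

# When the feature is complete, land it as one well-described commit.
git("checkout", "master")
git("merge", "--squash", "my-feature")
git("commit", "-m", "Add feature X, described completely in one message")
git("push", "origin", "master")
```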

With CI embedded in our source control, we can now look at how to make it better with code review. In a pre-commit code review workflow, colleagues must review and approve code changes before they are pushed to origin. While there’s the obvious benefit of having a second pair of eyes to catch common errors, the true benefit of code review comes from the social aspects. I don’t have scientific research to back this up, but I’ve noticed the following social benefits of code review:

  • The knowledge that someone else will be looking at my source makes me write better code.
  • Reviewing the source of others enforces style consistency and improves my understanding of the rest of the code base.
  • Constructive criticism from code review provides a positive feedback loop for developers looking to improve.

Programmers today are familiar with GitHub. It provides source control and a platform for collaboration around software; code review there works through pull requests. The only issue is that it’s SaaS, and you have to buy a subscription for things like private repositories, which was more than my budget (or lack thereof) could cover. To get around this I chose to deploy Phabricator, which was open-sourced by Facebook. All I had to do was salvage an old computer and self-host the application on the company’s intranet. Just like that, the company suddenly had a platform for documentation, task tracking, and code review. Documentation and issue tracking are critical to software development in their own right, but that will be a discussion for another day. Just know that we went from “tribal knowledge” to having a wiki for documentation, and on the issue tracking side we went from email threads to a working ticketing system. Phabricator completely changed our workflow in ways that CI alone could not have.
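For the curious, Phabricator’s code review tool (Differential) is driven by a CLI called Arcanist. Here’s a rough sketch of the pre-commit loop, assuming `arc` is installed and the repository has a .arcconfig pointing at your Phabricator instance:

```python
import subprocess

def arc(*args):
    """Invoke Arcanist, Phabricator's command-line client."""
    subprocess.run(["arc", *args], check=True)

# Send the local topic branch out for review; this creates a Differential
# revision on the Phabricator server instead of pushing to origin.
arc("diff", "origin/master")

# ...reviewers comment, the author revises and re-runs arc diff...

# Once the revision is accepted, land it on master as a single commit.
arc("land")
```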

Finally, we look into automating the review process and providing a truly continuous feedback loop for source pushed to the main repository. This is where Jenkins comes into play. Initially, my task was to automate the build process of our software and take that responsibility away from our company president. Jenkins does this at the click of a button by cleanly checking out the latest source, running a build script, and archiving all the resulting software artifacts. Its role in CI is to apply this process continuously to every commit that gets pushed to the repository. If the build fails for any product due to an unrelated change, the server immediately notifies the developer, who must correct the issue. This prevents masking issues that would otherwise go unnoticed until it came time to build a new version of a particular product. While our source was never designed for testability from the beginning, I’ve added a number of automated testing scripts that Jenkins runs on the built artifacts in lieu of unit testing. These tests continuously verify the basic operation of our products without the need for human testing resources. They serve as an extra layer of automated quality control which source must pass before being truly committed.
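As an illustration, here is the general shape of one of those smoke-test scripts; the artifact path and the `--self-test` flag are hypothetical, but the mechanism is real: Jenkins marks a build step as failed whenever its process exits nonzero.

```python
import subprocess
import sys

# Hypothetical path to the artifact Jenkins just built and archived.
ARTIFACT = "build/output/product.bin"

def smoke_test(artifact):
    """Exercise the freshly built artifact and report pass/fail."""
    result = subprocess.run([artifact, "--self-test"], timeout=60)
    return result.returncode == 0

if __name__ == "__main__":
    if not smoke_test(ARTIFACT):
        print("smoke test failed", file=sys.stderr)
        sys.exit(1)  # a nonzero exit fails the Jenkins build
```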

With these tools we are able to complete the CI workflow, which enforces high standards for source code being committed to a repository. Git ensures that the developer can always test against the latest source. Phabricator adds a feedback-based quality control check and social pressure for high-quality contributions. Finally, Jenkins automates the grunt work of building and testing each contribution against each product before giving the final approval. The result is that only the highest quality source code possible gets pushed into the repository. Adopting and deploying CI has changed the way I develop, and I can’t imagine ever going back to anything less.

Buffet Tracker and Winning My First Hackathon

A few weeks ago I was skimming one of the bulletin boards at the university for any interesting lectures when a poster for “Hacking Eating Tracking” caught my attention. The idea of attending a hackathon has always intrigued me, but I was always too intimidated to attend. As an introvert, it was the anxieties of forming a good team, not being able to do anything of value, and not having the requisite skills that built up the barrier to entry. Still, after reading through the website and mulling it over for a few days I decided to apply. My motivation came from my recent attentiveness to healthy living, the emphasis on hardware skills for this hackathon, and a wish to combine the two. Having now made it through the hackathon, I can say that my earlier anxieties just didn’t make sense. In a way, my experience has been about making those kinds of anxieties easy to conquer. Think of a hackathon as a place to meet interesting people, build things that don’t have to be perfect, and learn new skills. Just don’t stay away from one for the wrong reasons.

Prior to the hackathon, there was a Slack set up for people to meet and discuss projects and logistics. I commented on a few ideas and proposed my own for tracking grocery ingress/egress by scanning bar codes, but nothing really came of it. It wasn’t until I read Anandh and Arjun’s proposal to continuously track the weight of serving vessels in a buffet-style restaurant that I saw an interesting project to work on. It was feasible to carry out, the solution was unobtrusive to the participants being tracked, and the data had a lot of potential. After making contact with them, all that was left was to wait it out until September 18th rolled around.

The event started with a kickoff talk where I first met all the other hackers. What surprised me was the breadth of backgrounds everyone came from. In terms of age, the range ran from undergraduate underclassmen up to working professionals, though admittedly the skew was towards university students. More unexpected was everyone’s point of origin. An entire bus of hackers came from McMaster in Canada, a handful of people had flown in from western Europe, and people came from all around the northeastern states. Technical backgrounds included health, computer hardware/software, and data science, but the overwhelming majority were computer science students. I honestly felt like a minority as a computer generalist working in Boston.

Team forming took place the following morning, where I had my first glimpse of all the projects being attempted. We weren’t able to permanently recruit anyone to join our team, which was a bit disappointing at first, but it worked in our favor to have much lower synchronization costs. After the teams coalesced we started to get some work done and fully develop our idea for “Buffet Tracker”. The premise was to track eating behavior in a buffet-style restaurant because it’s a case where people are financially incentivized to over-consume. To do this, we wanted to detect a change in the weight of a serving vessel and link it to a specific plate when a person goes to take food. The result would be a dataset covering all the food procured from the restaurant’s serving area and the proportions of food on plates throughout the day. With that kind of data you can start looking into research questions such as “What foods occur together with the highest probability?”, “What restaurant layouts result in the least (or most) consumption?”, “What is the nutritional content of the average plate?”, or “What time of day are people most likely to eat a certain type of food?”. Too many of the other projects focused on tracking at the individual level, which tends to require some voluntary action or added overhead/complexity. Our project’s strength was to avoid that completely by trading off granularity for an entirely invisible solution to the problem.
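To make the first of those questions concrete, here is a minimal sketch of a co-occurrence query against the kind of database we collected; the schema and file name are hypothetical stand-ins, not what we actually shipped:

```python
import sqlite3

# Hypothetical schema for the procurement events described above:
#   events(plate_id TEXT, food TEXT, grams REAL, taken_at TIMESTAMP)
conn = sqlite3.connect("buffet.db")

# Which pairs of foods most often end up on the same plate?
pairs = conn.execute("""
    SELECT a.food, b.food, COUNT(*) AS together
    FROM events AS a
    JOIN events AS b
      ON a.plate_id = b.plate_id AND a.food < b.food
    GROUP BY a.food, b.food
    ORDER BY together DESC
    LIMIT 10
""").fetchall()

for food_a, food_b, count in pairs:
    print(f"{food_a} + {food_b}: on {count} plates together")
```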

The Buffet Tracker Proof of Concept

In order to get this all to actually work, we settled on tracking plates via RFID tags hot-glued to them. At each serving station there would be a microcontroller responsible for measuring a change in weight and an RFID reader for detecting the plate. When it detects a procurement event, it compiles the plate ID and the change in weight and reports them to a local computer, which parses the data and adds it to the database. A web server then presents a front end for querying the database and presenting graphs or tables.

My part of this project was to design the hardware to do all this. All we got from the hackathon was an Arduino, a force-sensitive resistor, and a USB cable, so I had to rely heavily on my stash of parts and tools, which wasn’t too bad. We would’ve been doomed, though, had I not had a bunch of RFID tags and a reader. Since we only had a day to do this, I settled on using the Arduino and the force-sensitive resistor to measure the weight. What I didn’t know was how bad it was at doing that: the granularity was basically only enough to tell whether there was an object on top of it or not. It also required that force be applied evenly over the surface area of a small disk, so we wrapped it in a rubber band and hot-glued a plate over it to serve as a demo serving vessel. The food also needed to maintain a centroid about the middle of the plate. Since the sensor was a passive resistive type, I created a voltage divider to sample the weight and calibrated it using some melon and a food scale I had lying around. Unfortunately, the voltage/weight relationship turned out to be nonlinear and very imprecise, but I went ahead and made a linear approximation of it anyway just to have something to present. To clean up the noise in the data I took the mean of eight samples (power of 2 for quick division, hehe) and reported that.

For communications I used the onboard USB UART chip to talk to the host computer, which is not the most elegant or scalable solution, but it’s what I had available to me. This created a problem for the ID-12 RFID reader, which also depended on a UART to report the ID of the tag, but I found a nice Arduino library that bit-banged a UART at the correct baud rate. For simplicity I used the RFID tag to frame the sample event: I sample the weight when the plate is first detected and again when the plate is removed. As a hack to figure out when the customer takes away their plate, I periodically toggled the reset pin on the ID-12 reader to trigger a new read; the moment it failed to read, the plate had been removed. With that, I had all the necessary pieces to link a plate to a change in weight. All that was left was to craft a packet and dump it over the bus, where a Python script picks it up and pushes it to the database in the cloud.
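Here is a rough sketch of what that host-side Python script looked like; the serial port and the line format are stand-ins (I no longer have the exact packet layout in front of me), and it assumes the pyserial package:

```python
import serial  # pip install pyserial

# Assumed line format from the Arduino, one procurement event per line:
#   "<plate_id>,<weight_before>,<weight_after>\n"
PORT = "/dev/ttyUSB0"  # wherever the Arduino's USB UART enumerates
BAUD = 9600

def read_events(port=PORT, baud=BAUD):
    """Yield (plate_id, grams_taken) tuples as events arrive on the bus."""
    with serial.Serial(port, baud, timeout=5) as link:
        while True:
            line = link.readline().decode("ascii", errors="ignore").strip()
            if not line:
                continue  # timeout or noise; keep listening
            plate_id, before, after = line.split(",")
            yield plate_id, float(before) - float(after)

for plate, grams in read_events():
    # The real script pushed each event up to the cloud database here.
    print(f"plate {plate} took {grams:.0f} g")
```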

I managed to finish the hardware sometime around 2 AM and decided to go home for some sleep. The next morning we did integration and documentation, which was admittedly very rushed, but I guess that’s the spirit of the hackathon. To my surprise, everything just worked after integration thanks to our well-defined hardware/software interface: we replaced their test unit with the unit I had completed the night before and the entire stack worked perfectly. It was incredibly satisfying to see the tables and graphs update immediately after we transferred food from the serving vessel to our demo plate. After that, we developed our presentation materials and pitch. Of course, when it came time to demo the product we had technical issues moving from the development environment to production: the wi-fi in the room failed and we couldn’t connect to the database in the cloud. In hindsight we probably should have just run it all locally, but oh well, the judges ended up loving the idea anyway. It met the requirements of the hackathon really well in that our solution provided clean quantitative data, was representative of a population, and was completely unobtrusive.

By the time we’d presented to the audience and it was time to announce the winners, we were brimming with anticipation as they kept announcing teams that weren’t ours. I really didn’t expect us to take first place given that I had no idea what I was doing at a hackathon, but I was obviously thrilled about the outcome. We ended up with $1500 split three ways, which was a nice little bonus. Would I do it again? Probably not. It’s pretty exhausting and stressful, and the per-hour value of the prize money was pretty low. I’m just glad I can finally cross this off the bucket list and that it ended up being an overall positive experience. I met some great people (big thanks to Anandh and Arjun!), developed a novel idea, won some cash, and got a neat story out of it. Hackathons are pretty cool, but not for me at this point in my life.

My Attempt at eMMC Data Recovery

So here’s a fun little story and an interesting side project rolled up into one post. It all started when my girlfriend and I were out one evening at Boston Symphony Hall to see Distant Worlds, an orchestral rendering of some Final Fantasy classics. Clumsy as I am, I managed to land my bum on her phone and snap it in half -_-; rendering it completely unusable. At that point I iterated through all the possibilities in my head to defuse the situation, and probably offered them all up in the span of one minute. Offer to replace it? Apologize profusely? Maybe Google backed everything up for her? Give her my phone? Recover the data myself? Needless to say, it wasn’t a fun few days. I ended up lending her my phone while we sorted this out, figuring that I don’t use mine nearly as much; the transition was as simple as doing a NAND backup and leaving her with a clean phone to do whatever with. I did relish the idea of recovering the data from the eMMC chip, since I knew such a project was well within my abilities, and quickly turned my attention to it.

At the onset, I was confident that I had built up the technical know-how to carry this out. At the university I had worked with eMMCs on Gumstix boards and knew that they look exactly like an SD card from a software perspective. From one of our products at work, I knew that electrically an eMMC looks like a BGA version of an MMC card on the schematic. Past forays into hardware reverse engineering had taught me that one of the first things people do is dump the data on the eMMC by soldering wires directly to the data pins. And from cyber security coursework, I had learned enough about disk forensics to recover the data once I dd’d an image, by repairing the file system and/or carving out any useful files.

The basic approach I formulated was the following:
1) Information gathering: learn as much about the hardware as possible; find pinouts; understand the eMMC protocol and how it compares to traditional SD; study similar projects and soldering techniques for BGA pins; fill in any other gaps in my knowledge.
2) Desolder the chip with a hot air gun.
3) Solder wires directly to the BGA pins on the eMMC and run them to the corresponding pins on a MicroSD card adapter.
4) Plug the MicroSD adapter into a standard SD card reader.
5) Immediately dd the /dev/sdx device into an image file and discard the physical media (sketched below).
6) Mount the image file and recover files from the file system (repairing as necessary).
7) For fun, recover deleted files using carving tools 😉
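Steps 5 and 6 are essentially one-liners on Linux. A minimal sketch, with the device node and mount point as placeholders, and noting that a phone image usually carries a partition table, in which case you would attach it with losetup -P and mount a partition instead of the whole image:

```python
import subprocess

DEVICE = "/dev/sdx"      # whatever node the card reader enumerates as
IMAGE = "emmc_dump.img"
MOUNTPOINT = "/mnt/emmc"

def run(cmd):
    subprocess.run(cmd, check=True)

# Step 5: raw-copy the whole device before the fragile solder job gives out.
run(["dd", f"if={DEVICE}", f"of={IMAGE}", "bs=4M", "conv=noerror,sync"])

# Step 6: loop-mount the image read-only and poke around the file system.
# (Needs root; if the image has a partition table, expose the partitions
# with losetup -P and mount one of those instead.)
run(["mkdir", "-p", MOUNTPOINT])
run(["mount", "-o", "loop,ro", IMAGE, MOUNTPOINT])
```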

I think my greatest difficulty in this attempt was finding high-quality technical resources online. There’s just far too much crap on the internet, and a lot of attention is given to explaining these concepts to the layman. Anything more sophisticated than “sending it in to an expert” or “accepting your losses” is quickly dismissed as impossible. GSM-Forum was probably the best source I found for this kind of information. Its focus was more on mobile phone repair than data recovery, but I was able to find good information on performing reworks on eMMCs.

Unfortunately, I couldn’t find test points for her particular phone, so I had to remove the BGA. This wasn’t too difficult: just blow hot air on it and lever it off with a blade. The annoying bit was the glue they put under the chip, which left a bit of a mess to clean up on the underside of the BGA. Cleanup consisted first of a rubbing alcohol sweep, then a lot of flux and solder to get a nice soupy surface of molten metal on the package, which could be swept up with my iron and some desoldering braid. A final rub-down with alcohol and we had a nice clean surface.

Since these eMMCs are jelly-bean parts, the pinout is pretty much standardized for each package regardless of the vendor. It was just a matter of carefully laying down wire on the tiny BGA solder pads. I managed to solder wires to all the pins, but ended up lifting the CLK pad while soldering the chip to the MicroSD adapter. The project was pretty much toast afterwards, since I didn’t want to go through the effort of dremeling into the package to expose more metal.

In hindsight, one step I regret not taking was trying to power on the board and recover the data directly through the Android Debug Bridge. Not once did I consider the chance that the logic board was still functional, which probably could’ve saved me a lot of trouble. Anyway, it was a fun little project to pursue; it didn’t have the outcome I desired, but I learned a lot from it. If the opportunity ever arises again, I’ll be prepared to succeed. 🙂