Design and Circuitry for a Smart Brake Light

Biking in New York often means traveling through highly congested areas: cyclists zip between cars on the road and contend with crowded bike paths fraught with obstacles. One fear of mine is that an unexpected, hard brake will cause the person behind me to crash into me. To minimize this risk, I decided to build a brake light that responds to the bike’s movements. Like a car’s brake light, it should glow brighter red when braking. This post explores the requirements and circuit design for the brake light, with a focus on how to choose components that work together.

A brake light can be designed in many different ways. To constrain the problem, I’ll present my requirements for such a light. First, I don’t want a complex installation: the light should simply attach to a bag, seat post, helmet or clothing. I decided that connecting to the brake pad to detect braking was too involved (this requirement made the problem much more complicated). Second, the light could be installed in any orientation; it should operate properly facing forwards, backwards, or any other direction. Third, the light should detect brake events as distinct from noise, and the algorithm should generally work for any vehicle, not just bicycles. Fourth, the light must operate on battery power for at least an hour, and it should not be too heavy or complicated to recharge. Finally, the device should be energy efficient and designed for a resource-constrained environment.

My solution to these requirements was to detect brake events using an accelerometer. Specifically, a microcontroller can process accelerometer readings to decide how to drive an LED. If the accelerometer readings suggest deceleration in the forward direction, the light should turn brighter red. In this circuit, the microcontroller reads data, processes it and operates an LED driver. This post presents a circuit design for this device.

Smart Brake Light Circuit Diagram

One caveat: because I am not an electrical engineer, I leaned heavily on two sites for instruction: the PCBHeaven article explaining various constant current drivers (link to page 4) and an Instructable implementing a constant current circuit. Both are excellent resources, but they gloss over some details. This post covers details those links don’t explain well, and focuses more specifically on how to choose compatible components for the circuit. I recommend reviewing the PCBHeaven article before reading this one.

First, how does the circuit work? We can simplify the circuit into two conceptual pieces: a current limiting circuit on the right, and logic controlling it on the left. The current limiting portion of this circuit primarily contains two transistors, $Q_1$ and $Q_2$, and two resistors, $R_1$ and $R_2$. The PCBHeaven article describes quite nicely how these components work together (I tweaked the variable names to match my R1 and R2):

When power is applied, the gate resistor R1 turns on the MOSFET. This
allows current to run through the LED, the MOSFET and the sensing resistor
R2. As current increases, the voltage drop across R2 is increased as well.
When this voltage drop reaches the base-emitter voltage of the transistor,
the transistor is turned on. This will pull the MOSFET's gate
to the ground turning it off. Therefore, the current through the LED is
regulated to the one defined by the resistor R2.

The other portion of this circuit is fairly straightforward. The IC drives the MOSFET’s gate through $R_1$ via a digital pin: the LED toggles on when the pin goes high, and off when the pin goes low. The voltage regulator stabilizes the accelerometer’s supply voltage, and therefore its readings. Finally, I inserted an optional boost converter beside the LED. This is probably unnecessary and inefficient, but I added it because in one experiment I used a 9V LED with a 3.6V battery. I suspect the boost converter also helps stabilize LED brightness as the battery drains, but I have not tested this hypothesis.

With this basic understanding of the circuit in mind, we can define the requirements of the circuit’s design and show how choices of component specifications affect those requirements. Hopefully my description here will help you better understand the circuitry. I found this exercise a worthwhile way to simultaneously learn and test my understanding of the circuit.

We can start choosing components for the circuit with the LED itself, say, the Cree XP-E Red LED (datasheet). The datasheet specifies a forward voltage of $V_{LED} = 2.3V$ at current $I_{LED} = 700mA$, which is also the maximum current rating for the red LED. Based on the datasheet, then, our target current for this particular color should be at most 700mA. The value $I_{LED} = 0.7A$ will be useful later.

Independently of our LED choice, we can also choose an appropriate $Q_1$ transistor. Luckily, most common NPN transistors are appropriate. The key specification here is that the transistor’s base-emitter saturation voltage should be some small (but not too small) value. It has to be small enough to guarantee that the voltage potential at the MOSFET source is greater than the transistor’s base-emitter voltage ($V_{Q_{1BE}}$) but less than the transistor’s absolute maximum $V_{BE}$, as defined in its datasheet. A small value also has the advantage that less power is lost through $R_2$. A very useful way to think about this is with the following equation: $V_{battery} - V_{LED} - V_{Q_{2DS}} - V_{Q_{1BE}} = 0$. This equation will appear again during the MOSFET and battery discussions. Let’s choose the 2n3904 transistor (datasheet), which has $V_{Q_{1BE}} = 0.65V$.

Having chosen a $Q_1$ transistor with known $V_{Q_{1BE}}$ as well as a max current, $I_{LED}$, we can then choose the appropriate $R_2$. As a reminder, $R_2$ determines the maximum current through the LED. So how can we choose the appropriate resistor? The key idea in this circuit is that $V_{Q_{1BE}}$ determines the voltage drop across $R_2$, or: $V_{R_{2}} = V_{Q_{1BE}}$. We can therefore derive that an ideal value for $R_2 = V_{R_{2}} / I_{R_{2}} = 0.65 / 0.7 \approx 0.93 \text{ Ohms}$. If we used a 1 ohm resistor, which is more common than 0.93 ohms, the current through the LED would be 0.65A (ie $0.65 / 1$). Therefore, we will set $R_2$ equal to a 1 ohm resistor. The next consideration for this resistor is its power rating. Since $P = I * V = 0.65 * 0.65 = 0.42W$, we need a resistor rated for at least 1/2 watt; let’s round up to 1 watt. In summary, $R_2$ is a 1 ohm, 1 watt resistor.
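To make these calculations easy to re-run for other LEDs and transistors, here is a minimal Python sketch of the arithmetic above. The values are the ones chosen in this post; treat it as a scratchpad, not a design tool:

    # Sense resistor (R2) sizing for the constant current circuit.
    V_BE = 0.65    # 2n3904 base-emitter voltage (volts)
    I_LED = 0.7    # target LED current (amps)

    r2_ideal = V_BE / I_LED         # R = V / I  ->  ~0.93 ohms
    r2 = 1.0                        # nearest common resistor value (ohms)
    i_led_actual = V_BE / r2        # regulated LED current: 0.65 A
    p_r2 = i_led_actual * V_BE      # power dissipated in R2: ~0.42 W

    print(f"ideal R2 = {r2_ideal:.2f} ohms")
    print(f"current with 1 ohm = {i_led_actual:.2f} A, R2 power = {p_r2:.2f} W")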

At this point, we’ve identified the LED, $Q_1$ and $R_2$, and we can now select the MOSFET, or $Q_2$. The MOSFET is a transistor that uses a relatively small voltage at its gate to control a much larger current load. I found two requirements that determine whether a MOSFET is usable in this circuit. The first requirement is that its maximum gate threshold voltage ($V_{GS(th)}$ in datasheets) must be less than the high-level voltage emitted by a pin on the IC; if we assume the IC runs at 3.3V, then the threshold should be smaller than that. The second requirement is that we respect the absolute maximum rating of its drain-to-source voltage and also ensure we don’t overheat the component. It turns out that this has more to do with our choice of battery: a higher voltage battery results in more power (and heat) lost in the MOSFET. I’ll explain why this is the case.

The MOSFET in our circuit guarantees that the voltage drop between its drain and source ($V_{Q_{2DS}}$) is sufficiently large to satisfy the equation we saw before: $V_{battery} - V_{LED} - V_{Q_{2DS}} - V_{Q_{1BE}} = 0$. You might notice that, with $V_{LED}$ and $V_{Q_{1BE}}$ fixed, this essentially comes down to choosing a battery voltage very close to $V_{LED} + V_{Q_{1BE}}$. For instance, if the battery voltage, $V_{battery}$, ranges between 3.6V and 4.2V, and we plug in the other previously determined values, then $0.65 \leq V_{Q_{2DS}} \leq 1.25$. Clearly, a higher battery voltage results in more power lost in the MOSFET. We can evaluate the worst case power dissipated by this component at the max battery voltage as $P_{Q_{2}} = I * V = I_{LED} * V_{Q_{2DS}} = 0.65 * 1.25 \approx 0.81$ watts. And finally, we can estimate the MOSFET’s temperature rise by plugging its thermal resistance, $R_t$, into the equation $\text{Temp rise} = P * R_t = 0.81 * R_t$. The MOSFET’s datasheet will give us an idea whether this temperature requires a heatsink. I chose to use the IRLB8743 MOSFET (datasheet). Without a heatsink, its $R_t = 62$ degrees C per watt, resulting in a very acceptable rise of roughly 50 degrees C above ambient.
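The same arithmetic for the MOSFET’s worst case dissipation, again as a Python sketch with this post’s component values:

    # Worst-case MOSFET (Q2) dissipation and temperature rise.
    V_LED = 2.3        # Cree XP-E Red forward voltage (volts)
    V_BE = 0.65        # Q1 base-emitter voltage (volts)
    I_LED = 0.65       # regulated LED current with a 1 ohm R2 (amps)
    V_BATT_MAX = 4.2   # fully charged Li-ion cell (volts)
    R_T = 62.0         # IRLB8743 thermal resistance, no heatsink (deg C per watt)

    # KVL around the loop: V_batt - V_LED - V_DS - V_BE = 0
    v_ds = V_BATT_MAX - V_LED - V_BE   # 1.25 V worst case
    p_q2 = I_LED * v_ds                # ~0.81 W
    temp_rise = p_q2 * R_T             # ~50 deg C above ambient

    print(f"V_DS = {v_ds:.2f} V, P = {p_q2:.2f} W, rise = {temp_rise:.1f} C")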

As previously mentioned, the battery’s voltage should be just large enough to power the IC and LED, but no larger because excess voltage is lost as heat in the MOSFET. I used 3.6V rechargeable Lithium Ion batteries.

Last but not least for the constant current components, we can choose an $R_1$ resistor almost arbitrarily, with a relatively high resistance. When choosing a resistor value, we simply want the maximum resistance that still provides enough voltage potential to drive the MOSFET’s gate. The PCBHeaven article suggests 0.8 to 1mA of current. For instance, if the IC pin voltage is 3.3V and the MOSFET gate-source voltage is 2.35V, we can calculate a desired $R_1 = (V_{\text{IC pin}} - V_{Q_{2GS}}) / 0.001 = 950 \text{ ohms}$. In practice, I used a 10kOhm resistor and had no problems.

At this point, we have discussed all components except the IC, accelerometer and voltage regulator. I will only briefly cover my choices for these, as the circuitry for them is simple. The IC is an ATTiny85; the choice of microcontroller doesn’t really matter, though the chip must work with the chosen battery voltage. For the accelerometer, I chose the ADXL335 (datasheet), which I found readily available and conveniently soldered onto a breakout board. The version I purchased was unregulated, so I added a low dropout (LDO) linear regulator, the AP7365 (datasheet). This regulator’s datasheet requires two 1uF capacitors between IN-GND and OUT-GND. Note that if the battery voltage is significantly far from the IC and accelerometer voltage requirements, this kind of linear regulator may dissipate too much power, overheat, or temporarily shut itself down.

I hope this description of the circuit was helpful and informative. It took quite a lot of research to figure out! For those interested in building the circuit, I am pasting the “cheat sheet” below:

Components for any color LED:

  • LED: Cree XLamp XP-E Red 620-630NM w/20mm Star Base
  • R1: 10k ohm (1/4W) resistor
  • R2: 1 ohm (>=1W) resistor (choose this to determine current through LED)
  • Q1: 2n3904 NPN transistor (VBE = VR2)
  • Q2: IRLB8743 MOSFET
  • Lithium Ion 3.6V battery
  • Accelerometer: ADXL335 at 3V (find one with onboard regulator)
  • Voltage Regulator: AP7365 and 2 1uF capacitors (only if accelerometer doesn’t already have onboard regulator)
  • IC: ATTiny85 (you’d need to build/buy an AVR ISP to program it)

Note: If you swap out the LED, you may also want to use a different R2. For instance, the Cree XP-E White LED could use an R2 of 0.68 ohm to get a current of 956mA. Also, LEDs like the Cree XM-L2 are brighter but battery hungry.

Useful equations mentioned in the post:

  • $V_{battery} - V_{LED} - V_{Q_{2DS}} - V_{Q_{1BE}} = 0$
  • $V_{R_{2}} = V_{Q_{1BE}}$ and therefore, $R_2 = V_{R_{2}} / I_{R_{2}}$
  • $P_{Q_{2}} = I_{LED} * V_{Q_{2DS}}$

Surveying bike light components

A simple search for bike lights on Google, at REI or in a bike shop will show that lights made today consist of LEDs with a battery, button, electronics and a reflective plastic lens. From there, things begin to vary. This post surveys the basic components of a simple, generic bike light with two goals in mind. The first goal is to introduce basic knowledge necessary to begin building a bike light: How do the basic components work at a high level? What kinds of challenges do these components introduce? Learning how these components work may provide some intuition about why so many variations of bike light exist. The second goal of this survey is to open the door for deeper and more focused future blog posts as well as inspire my ongoing research into the topic.

As the primary component of a bike light is an LED, it seems reasonable to survey the LED first. LED stands for Light Emitting Diode. Being a diode, the LED has some interesting properties. One property of a light emitting diode is that it will not emit light or sustain a noticeable current unless the voltage across its two leads exceeds some threshold. Once the voltage exceeds that threshold, the diode’s resistance drops to almost zero and current increases rapidly, limited only by the rest of the circuit (which in turn affects the voltage). Since higher current through the diode produces a brighter light but also more heat, the LED can quickly act like a short circuit and burn out or catch fire. Therefore, a mechanism must exist to limit the maximum amount of current allowed to flow through the LED when it is turned on. Check Wikipedia for details about LEDs and diodes.

The process of controlling the light output of an LED is typically referred to as “driving an LED.” There are various ways to “drive” an LED, and different LED drivers typically make choices that maximize energy efficiency and minimize circuit complexity. A driver must limit the current supplied to the LED, and it might also vary the light’s brightness. A very simple circuit to drive an LED could use a resistor in line with the LED. However, if the LED draws a lot of power, the resistor will waste a fair amount of electricity as heat. This inefficiency is costly in a battery operated system. More complex constant current (CC) circuits use two transistors and two resistors arranged in a particular way that efficiently limits the max current through the diode. There are a variety of ways to power LEDs, and the required power electronics can be quite sophisticated! I have not found many good tutorials on constant current LED drivers, but this instructable is enlightening, and this pcbheaven.com tutorial is great. A more helpful search term is “constant current circuit.”
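To see why the simple resistor approach is wasteful, here is a rough back-of-the-envelope calculation in Python. The battery and LED numbers are hypothetical, chosen only for illustration:

    # How much power does a series resistor waste when driving an LED?
    V_BATT = 3.6   # battery voltage (volts) - illustrative
    V_LED = 2.3    # LED forward voltage (volts) - illustrative
    I_LED = 0.5    # desired LED current (amps)

    r = (V_BATT - V_LED) / I_LED           # required series resistance (ohms)
    p_led = V_LED * I_LED                  # power delivered to the LED
    p_resistor = (V_BATT - V_LED) * I_LED  # power burned as heat in the resistor

    print(f"R = {r:.1f} ohms")
    print(f"LED: {p_led:.2f} W, resistor: {p_resistor:.2f} W "
          f"({p_resistor / (p_led + p_resistor):.0%} wasted)")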

Dimming an LED, or changing its brightness, is also a feature of LED drivers. Somewhat surprisingly, dimming is not as simple as one might think. An inefficient way to dim an LED is to reduce the current through it. This method, known as “analog dimming,” is problematic because, as DigiKey explains, “the regulator supplying the current to the LED must soak up any power not supplied to the LED. […] That power is wasted as heat.” Due to the issues with the analog approach, a digital method called Pulse Width Modulation (PWM) has been developed. PWM represents an analog signal by controlling the duty cycle of a high-frequency square wave. In other words, a square wave oscillates between high and low voltages. When the square wave goes high, the LED turns on; when it goes low, the LED turns off. This on-off cycle happens many times per second. Within each on-off cycle, known as the wave’s period, adjusting the proportion of the period spent high adjusts the perceived brightness. This proportion is known as the duty cycle, and a 70% duty cycle is brighter than a 20% duty cycle. A convenient way to think about PWM is as a “flicker” rate where the duty cycle adjusts the percent of time the square wave is high. In this context, you may notice intuitively that if the frequency of a PWM signal is low, it will flicker slowly enough that the human eye can see the individual on-off cycles. Low frequency PWM can be useful for power savings in some contexts because faster clock speeds are associated with more energy usage, but high frequency PWM avoids visible flickering. Therefore, to adjust brightness, we want a high frequency PWM signal with a variable duty cycle. Implementing PWM in an embedded system is an elaborate topic, and several articles online are dedicated to it. We will dive into this in a future blog post.
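In the meantime, here is a toy software PWM loop in Python that makes the duty cycle idea concrete. Real bike lights use the microcontroller’s hardware timers; this sketch, with a hypothetical set_pin function, only illustrates the on/off timing:

    import time

    def software_pwm(set_pin, duty_cycle, frequency_hz, cycles):
        """Toy software PWM: toggle set_pin(True/False) at the given duty cycle.

        duty_cycle: fraction of each period spent high (0.0 to 1.0).
        frequency_hz: number of on-off periods per second.
        """
        period = 1.0 / frequency_hz
        for _ in range(cycles):
            set_pin(True)                           # LED on
            time.sleep(duty_cycle * period)
            set_pin(False)                          # LED off
            time.sleep((1.0 - duty_cycle) * period)

    # A 70% duty cycle looks brighter than 20%, even though the LED is
    # only ever fully on or fully off.
    software_pwm(set_pin=lambda state: None, duty_cycle=0.7,
                 frequency_hz=500, cycles=1000)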

Pursuing the LED topic further, it is important to recognize that there are many different kinds of LEDs, both in terms of form factor and purpose. One distinction worth pointing out is high power vs low power LEDs. In bike lights, front lights use high power, super bright LEDs; Cree and Luxeon Rebel are common brands, for instance. These diodes use more current (typically on the order of 0.5 to 2 amps). High power LEDs are so bright that they leave white spots in your eyes for a few minutes if you look directly at them. Because they draw so much current, they can get quite hot and are therefore mounted on various kinds of heat sinks, typically using a thermal epoxy. These heat sinks themselves come in a variety of shapes, sizes and materials. Low power LEDs, on the other hand, have different requirements. These are more common as red tail lights. Because low power LEDs have much smaller power demands, we can often replace complex constant current circuits with resistors. These LEDs can quickly burn out if given an incorrect voltage, though, and it is important to realize that forward voltage varies quite a bit between colors (red is probably the only useful color for a tail light). Thus, we can have high power LEDs for the front light and high or low power LEDs for the back light.

Another distinction between LEDs is their forward voltage. The forward voltage is the voltage drop across the leads of the LED when the nominal current flows through it. LEDs designed for battery operated settings (such as flashlights and bike lights) tend to have a forward voltage around 3.6 volts, which is quite similar to a standard voltage for Lithium batteries. However, LEDs designed to replace household Halogen bulbs (like the GU10 variety) expect to run off a much higher voltage; the GU10 bulbs I tested had a forward voltage of 9V. GU10 bulbs are neat because they have perfect heat sinks for bike lights, but the higher voltage requirement makes them a less appealing choice.

Finally, on the topic of LEDs, it is also useful to consider how LEDs are sold and packaged. In general, the packaging methods for an LED are the same as for many electronic components. Surface mount LEDs commonly come as a “tape” or “roll” of tiny components that must be surface mount soldered onto a heat sink; very small batches are sometimes sold in ziploc bags. The process of soldering a surface mount LED onto anything is quite involved because the diodes are sensitive to heat and moisture. Many methods for soldering exist and are quite elaborate in production settings. I generally recommend avoiding surface mount soldering (SMD) of LEDs; if necessary, use solder paste and a hot air gun, and watch a few instructional Youtube videos first. The most common form of surface mount LED sold to hobbyists is the “star”: a star-shaped heat sink with an LED already mounted onto it. I highly recommend these as a starting point. The form factor for through-hole LEDs does not provide a means to sink heat, so through-hole LEDs are low power. These are fairly straightforward. Shopping online for LEDs is probably the best way to learn more about the varieties available.

Moving on from LEDs, we can enter the world of batteries and battery chargers. For a bike light, the key considerations for a battery are its voltage, its milliampere-hour (mAh) rating, its form factor, whether it is rechargeable, and its chemistry. Picking an appropriate battery voltage depends on the voltage requirements of the LED driver and other electronics; typically, a 3.6V Lithium battery is a good choice. Battery vendors market the mAh rating as if it measured battery capacity, but milliampere-hours are technically not a measure of capacity: they are a unit of charge that defines how much total charge the battery contains. An acceptable approximation of how many hours a battery will last is to divide its mAh value by the mA consumption; for example, a 2000mAh battery driving a 650mA load should last roughly three hours. There is also a relationship between mAh and voltage worth knowing. Specifically, placing batteries in series adds their voltages but not their mAh. Conversely, batteries in parallel keep a constant voltage but add their mAh.
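A quick sketch of this arithmetic in Python (the cell capacity here is a made-up example):

    # Rough battery runtime estimate. Real capacity varies with discharge
    # rate and temperature, so treat this as a ballpark.
    capacity_mah = 2000   # hypothetical Li-ion cell
    load_ma = 650         # LED current from a constant current driver

    print(f"runtime ~ {capacity_mah / load_ma:.1f} hours")  # ~3.1 hours

    # Two such cells in series:   7.2 V, still 2000 mAh.
    # Two such cells in parallel: 3.6 V, 4000 mAh.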

In terms of battery form factor, there are several shapes and sizes of batteries, and size is often an important consideration in project design. Naming conventions for battery form factors are messy. However, for larger cylindrical Lithium-ion batteries, the naming convention is fairly straightforward and useful. According to Wikipedia, “the larger rechargeable cells are typically assigned five-digit numbers, where the first two digits are the (approximate) diameter in millimeters, followed by the last three digits indicating the (approximate) height in tenths of millimeters.” For instance, 18650 batteries, popular in LED flashlights, are 18mm in diameter and 65mm long.

Battery chemistry is very complicated and I have not studied it. Chargers for the various battery chemistries are also quite complicated and detailed. The relevant facts about Lithium batteries are that they easily explode or catch fire if incorrectly charged, they die if depleted beyond some minimum voltage, and they can emit huge amounts of current. To solve the minimum voltage problem, some batteries are “protected,” meaning they have integrated circuits that prevent current draw if the battery voltage drops below a minimum acceptable limit. Protected batteries are typically longer than unprotected ones, which shows up in the last 3 digits of their conventional names.

Regarding batteries, my recommendation for a bike light is to stick with one of these 3 options: use 3.6V Lithium batteries for a high power LED, consider AA or AAA cells for low power LEDs, or use a pre-made battery pack, which can also be a decent option. If you go with Lithium, be aware of protected vs unprotected batteries.

Finally, as this post has grown quite long, I will briefly outline the topic of buttons. Buttons are, surprisingly, also a large topic. I find the distinction between a “button” and a “switch” confusing, so I will define terms. A button refers to any mechanism by which pressing it results in some action; I consider a button a mechanical interface that activates a switch. A switch is any device that can open or close a circuit. For instance, the “lock” button in a wifi-enabled door lock might be an icon on your phone, while the switch is the component on the door lock that receives the wifi message to “lock door” and triggers the door lock. In general, the process for choosing a button is: identify a possible design, choose the appropriate switch for that design, implement a prototype, and iterate on these steps until happy. Different kinds of switches exist, including reed, latching, momentary, rocker and “button” switches; read about them on Wikipedia. A bike light typically requires that the button be waterproof or water resistant (see Wikipedia on IP ratings). Designing a water-resistant button requires some creativity, and I’ll cover my approach to it in a future blog post.

One useful way to learn about switches is by shopping online (ie DigiKey) and reading the datasheets for the various products. There are literally tens of thousands of products available, which I find quite impressive. I would guess there are two reasons why so many switches exist.

First, switches are cheap, easy to amass, often easy to swap out, and, I presume, easy to manufacture. When trying switches, designers do not need to think as carefully about specifications as they do with other electrical components such as transistors. The second reason I think so many products are available is that people (such as myself) naturally confuse the “button” with the “switch.” While buttons seem like simple, dumb components, a button is not a switch. The mentality that a button is a simple topic not worth spending time on goes hand in hand with an unrealistic expectation that a “ready to go” button should be available. The reality, though, is that designing a reliable button can be quite challenging, as it integrates with and often requires additional mechanical parts. For instance, I recently took apart my electric shaver and noticed that the “button” was directly integrated into the plastic body of the shaver; certainly, a company like Digikey couldn’t sell this button without advance knowledge of that particular product’s design. The switch, however, was a standard tactile switch. I argue that a switch, unlike a button, is a clearly defined and “ready to go” component. As a result of the availability of switches and the confusion between buttons and switches, people are eager to find a switch that also solves the button design problem, when what they should probably be doing is designing a button and then swapping in switches that fit the design.

The last topic on buttons and switches I’ll bring up is that of debounce algorithms. When a switch acts on a circuit, say to open or close it, it affects the circuit in a variety of ways. Ideally, we expect a switch simply to open the circuit or close the circuit. In reality, the switch has resistance, current and voltage limits, and exhibits a complicated, probably chaotic, period of instability during state changes. This reality can pose significant problems. For instance, imagine a momentary switch controls the power of a bike light: when the button is pressed and released, the light toggles on. The problem arises in detecting what denotes a “press” event and a “release” event, and debounce algorithms specialize in solving this. Specifically, when a momentary switch is pressed, a metal contact flips into position such that it closes a circuit. During the flip, the contact can close and open the circuit many times within a few milliseconds until it stabilizes in the “closed circuit” state. On release, a similar process may happen. I have not stumbled across research that explores what actually happens to the circuit during the flip, but I imagine it is quite complex and seeping with theoretical mathematics. For practical purposes, we simply need algorithms that identify the instability of the switch and use that period to define when the switch is explicitly “pressed” and “released.” There is an abundance of information about debounce algorithms and their implementations available online. Here’s one good article.
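As a sketch of the idea, here is a simple time-based debounce in Python. The read_pin function is a hypothetical stand-in for sampling the switch; real microcontroller code would look different but follow the same logic:

    import time

    def debounced_events(read_pin, debounce_s=0.02):
        """Yield 'pressed'/'released' events from a noisy switch input.

        read_pin: hypothetical function returning True while the switch
        is closed. A state change only counts once the raw input has
        held its new value for debounce_s seconds.
        """
        stable = read_pin()      # last accepted (debounced) state
        last_raw = stable        # most recent raw sample
        last_flip = time.monotonic()
        while True:
            raw = read_pin()
            now = time.monotonic()
            if raw != last_raw:
                # Raw input flipped: restart the stability timer.
                last_raw = raw
                last_flip = now
            elif raw != stable and now - last_flip >= debounce_s:
                # Input held its new value long enough: commit the change.
                stable = raw
                yield "pressed" if stable else "released"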

And that concludes our survey of bike light components! As previously stated, the primary goal of this post is to present a high level view of how the basic components of a bike light work, so we can both gain insight into the challenges these components introduce and garner intuition for why so many variations of components exist. We discussed different LEDs and the challenges of driving them. We also discussed important factors in choosing batteries, particularly Lithium ones, and exposed some often-confused intricacies between buttons and switches. Careful readers will note that I have not covered reflective lens design and materials for lights; I have chosen to skip that discussion because I have not done much research into it. I hope this post serves as a useful starting point for those undertaking similar projects. Please feel free to check out my other posts on this topic via this project’s table of contents!

A foray into bike lights, from front to back.

Conceptually, a bike light can seem simple: a bit of wire, a battery and a light bulb. But take a closer look at one. There is a lot going on in there. Voltages are changing thousands of times a second, debouncing algorithms distinguish between user interaction and noise, and power electronics limit current draw through the LED. A bit of inspection reveals that the simplest light hides an incredible amount of complexity and is the result of research from many disciplines. A bike light isn’t just a useful physical object; it is a gateway to research and discovery.

I would like to share with you my ongoing project to build and explore bike lights. I will present this topic as a series of blog posts. My overall goal is two-fold: First, I want to present a basic knowledge of electronics, probability, engineering and programming so that you may be inspired to start a project of your own. Second, I am excited to highlight some of the concepts, algorithms, and mathematics hidden behind such a deceptively simple topic. I hope you will join me in this adventure and peek into a deep and fascinating world.

The audience for these posts varies depending on the topic. Hopefully, someone generally interested in electronics, probability, software and hardware can enjoy these posts.

Here is a “table of contents” for the upcoming series. I will write these posts out of order and link to them here:

  1. Surveying bike light components:

    The series begins by introducing the components of a typical bike light and reviewing how these components work and what challenges they introduce. Think of it as a survey of relevant hardware components.

  2. Design and Circuitry of a Smart Brake Light:

    A deeper dive into the process of creating a constant current circuit to drive the LED. We will lightly review how the circuit works and use a few important equations to understand how to choose the components of the circuit.

  3. AVR programming an ATTiny85 microcontroller without Arduino

  4. PWM and debouncing on an AVR microcontroller

  5. Presenting my first 2 simple white bike light builds and why the first one burned out.

  6. An algorithm for a “smart” brake light using rolling averages, the Z score, and the binomial probability mass function.

  7. An algorithm to calculate moving averages of a timeseries in an extremely low memory environment.

  8. The problem with 3 dimensional accelerometer readings to account for road bumps, inclines, turns. Research experiments and potential next steps.

Artificially Intelligent Yogurt Maker!

In this post, we’re going to build a thermostat. Thermostats are great things to know how to build, because they have a wide variety of uses: Sous-vide, home brewing, automating kiln temperature for pottery, making yogurt, adding temperature control to a soldering iron, smart fridges, etc. The device described here can be used for all of the above.

You might be wondering what it is that makes a thermostat artificially intelligent. Temperature control problems, and control theory in general, are surprisingly complex, but the problem is intuitively simple: To make yogurt, we wish to heat a container of milk. As we heat the milk, the milk particles closest to the heat source will have a higher temperature than the rest of the liquid. We’ve just recognized the milk container as a dynamical system, where warmer particles are gradually moving towards colder particles over time. Now let’s say a temperature sensor is placed in the coldest corner of the milk container, and let’s use this sensor to determine when to turn on the heating element. Because the milk heats up unevenly, we will drastically over-heat our milk if we wait for the temperature sensor to reach the target temperature. And to complicate the problem further, the rate of heat transfer across the milk is different at different temperatures. So what can we do?

The PID controller is a thermostat (and generic control mechanism for dynamical systems) that takes into consideration three things:

  1. how far away the current measured temperature is from the target temperature
  2. the history of temperature recordings
    • How much have we been off by in the past?
  3. and a measurement of how this error is changing
    • In the next split-second, will we go further away from our target, or closer to it?

PID, which stands for proportional-integral-derivative, combines these three things as a weighted sum to identify how far away from some target variable (a target temperature) we actually are: $$\text{output} = x_P \cdot e(t) + x_I \cdot \int_0^t e(\tau)\,d\tau + x_D \cdot \frac{de(t)}{dt}$$ where $e(t)$ is the difference between the target and the measured temperature at time $t$.

The $x$ values are simply weights that define how much we should value P over I or D. In our case, the $x$ values characterize how heat flows through our milk container. If we can tune these values to the correct numbers, we will have a clear estimate of how far the current container temperature is from our target temperature. That estimate can then directly determine what percentage of the next 2 seconds (or any arbitrarily chosen period) the heating element should be turned on for. Because finding these $x$ values by manual testing is laborious, people have developed several different (machine learning) algorithms to tune these parameters, and the actual process of using a PID controller is simply one of making predictions using the learned $x$ parameters.
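Here is a minimal PID controller sketch in Python to make the weighted sum concrete. The gains and the example reading are placeholders; real values would come from tuning:

    class PID:
        """output = x_p*error + x_i*integral(error) + x_d*derivative(error)"""

        def __init__(self, x_p, x_i, x_d, target):
            self.x_p, self.x_i, self.x_d = x_p, x_i, x_d
            self.target = target
            self.integral = 0.0
            self.prev_error = None

        def update(self, measured, dt):
            """Return the control output for a new reading taken dt seconds apart."""
            error = self.target - measured      # P: how far off we are right now
            self.integral += error * dt         # I: how far off we've been historically
            derivative = 0.0 if self.prev_error is None \
                else (error - self.prev_error) / dt  # D: how the error is changing
            self.prev_error = error
            return (self.x_p * error + self.x_i * self.integral
                    + self.x_d * derivative)

    # Hypothetical usage: decide what fraction of the next 2-second
    # window the heating element should be on.
    pid = PID(x_p=0.5, x_i=0.05, x_d=0.1, target=100.0)   # 100 F for yogurt
    on_fraction = max(0.0, min(1.0, pid.update(measured=92.0, dt=2.0)))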

The next section of this post is a tutorial describing how you can build your own yogurt maker. We will use a PID controller to power an electrical (GFCI) outlet using electricity from a wall outlet. You will need a few essential ingredients.

NOTE: I take no responsibility for anything you do as a consequence of these instructions. Proceed entirely at your own risk!

Once you’ve collected the parts, it’s time to put them together!
Before you start connecting components, triple check that everything is disconnected from a power source!

The wiring diagram we’ll follow should look something like this:

Wiring Schematic for PID controller

The AC power source, which comes from a wall outlet in your home and is carried through the 3-prong plug, has 2 hot wires and 1 ground wire. One of the hot leads (it doesn’t matter which one) goes to two places: directly into the AC side of the SSR relay, and also to one of the “AC in” terminals of the PID controller. The other hot lead connects to the GFCI outlet, and also to the second “AC in” terminal on the PID controller. There are two ways to connect the wires: you can splice the hot leads with two spare wires (of sufficient gauge) using wire nuts and then connect each end to the PID and relay, respectively. The other option is to avoid splicing altogether by first connecting each hot lead to its respective terminal on the relay, and then connecting the PID “AC in” directly to that same terminal on the relay. I prefer the splice + wire nut method, because splicing can make things much clearer. The ground (green wire, typically) connects to the GFCI ground. If you aren’t sure which 2 wires are the hot ones, that’s bad: at this point, you need to be absolutely certain which wire is ground and which ones are hot. The hardest step here is figuring out which terminals on the PID controller are the right ones. Consult your PID controller manual for the relevant ports, as different models have slightly different configurations.

Second, cut a section of spare wire. The spare wire should be rated to at least the same amperage as your outlet. The outlet listed above is rated to 15A, and 18AWG wire can handle a maximum of 16A according to this site, so that gauge should be fine for our purposes. Connect one end to the available terminal on the AC side of the relay, and the other to the available terminal on the GFCI outlet.

Third, cut two sections of spare wire. With one wire, connect the “DC out +” terminal on the PID controller to the “DC +” terminal of the relay. With the second wire, connect the PID controller’s “DC out -” terminal to the relay’s “DC -” terminal. Your final wiring configuration should look like the image below. For clarity, I removed the hot leads from the picture. If you chose the wire nut route, you should have one hot lead connected to each wire nut, and ground connected to the GFCI outlet. In this picture, the striped green wire is not ground.

Diagram 1

At this point, you should have wires connected to: all 4 terminals of the relay, all 3 terminals of the GFCI outlet, and 4 wires connected to the PID controller. The last remaining step for the electrical side of things is to connect the thermocouple to the PID controller. The thermocouple listed above has 3 wires, which correspond with three terminals on the PID controller. The instruction sheet on your PID controller will tell you where to connect the thermocouple.

Great! Now you should technically be able to plug the device into the wall and start using it. Double check the wiring, then plug the 3-prong plug into the wall. The PID controller should turn on. When it does, verify that the temperature reading seems reasonably accurate and that it changes when you touch the thermocouple. Then, adjust the target temperature settings until you see the light on the relay (or your GFCI outlet, if it has one) turn on. Once you pass this step, you know that everything is wired correctly.

The only remaining steps here are to mount the heat sink on the relay using thermal compound and a couple screws (if the heat sink has screw holes, which it should), and then to make your enclosure box. Applying thermal paste is fairly simple: clean the relevant heat sink surface (and relay bottom) with alcohol, dab a drop of thermal paste on the heat sink, and use a credit card to spread it evenly into a thin layer. When the layer looks evenly distributed, plop the relay on top of it and screw it onto the heat sink. You can find videos of this process on Youtube.

The enclosure box can be a pain in the ass if you don’t have the proper tools. I’ll leave the design of that to you, since you will probably have different dimensions than I do. I recently built my own 3d printer, so I may make my own enclosure the day I decide to build my own PID controller using an Arduino… Otherwise, here’s my final product!

For a yogurt maker, simply connect a heating element (such as a crock pot) to the GFCI outlet, fill the crock pot with milk and a scoop of (non-honey flavored) yogurt, set the target temperature to 100 degrees Fahrenheit, and wait 5-7 hours. In the last year, I’ve had really good success between 95 and 108 degrees F. The lower end of the temperature range makes for a smoother, Wallaby-like yogurt, while brewing at the higher end results in chunkier yogurt. The brewing time determines how sour the whey will taste.

Yogurt Maker

Probability Mass Function of a Binomial Distribution (PMF)

This post is part 2 of my previous one: Frequently Occurring Pairs and the n-choose-k Problem.

In part 1, we derived an equation for ${n\choose k}$, the number of k-element combinations from a group of size $n$, by thinking about permutations and combinations. In this post, we’ll use this knowledge and some probability to identify which specific pairs occur more than $K$ times in our collection of size $n$. A key difference from the last post is that previously we knew only the number of elements in our data set; in this post, our input is a collection of $n$ elements.

In the earlier post, we figured out the number of $k$-element pairs in a set of size $n$: $${n \choose k} = \frac{n!}{k!\,(n-k)!}$$

In this post, we’ll cover the probability mass function, where $K$ is the number of occurrences of a particular pair (of any size), and $N$ is the number of observations we make: $$p(K) = {N \choose K} \cdot p(\text{pair})^K \cdot (1 - p(\text{pair}))^{N-K}$$

Note that the similarity between the variables $k$, $K$, $n$ and $N$ may be confusing, but I am trying to maintain similarity with the notation you might find on Wikipedia. Specifically, $n$ individual elements are grouped into k-element pairings. The collection of all k-element pairs that can be made from $n$ elements has a size of $N$, and the k-element pairs can be grouped into arbitrary sizes, $K$.

For the sake of this post, let’s assume that the occurrence of any pair is independent of the occurrence of any other. In other words, occurrence of one event (ie the k-element pair) has absolutely no influence on the occurrence of any other event (ie any other k-element pair), and this applies to all events in our space (of size $N$). This assumption leads us to the very useful insight that we don’t have to care about what other k-element pairs exist (or don’t exist) when questioning whether a particular k-element pair exists.

So how do we know whether a k-element pair exists? In other words, what is the probability of a k-element pair? Imagine we have a set of $N$ unique elements, and then pop a pair out of the set. In this case, we could say that there is a $1/N$ chance that our pair is equal to any particular pair in the set that we’ve already chosen. Now let’s complicate things by saying that our collection of $N$ elements is no longer unique. If we were to pop one k-element pair out of the set, we certainly could not say that there is a $1/N$ chance that our pair is equal to some particular pair in the collection, because we don’t know what fraction of the collection that particular pair represents. Rewording this a little, we can say that the probability of pulling one particular pair from the set depends on how many duplicates of that pair exist in the collection, or more specifically, $$p(\text{pair}) = \sum_{i=1}^{\text{num dups}} p(\text{ith dup pair}) = \frac{\text{num dups}}{N}$$

The individual probabilities of each duplicate pair are the same because we’ve assumed that the occurrence of any pair is completely independent of the other. If we considered conditional dependence, each $p(\text{ith dup pair})$ would depend on other elements and this blog post is probably irrelevant unless $n$ is very large.

As you can see above, we’ve defined the probability of pulling a particular k-element pair from the collection of pairs. If we counted the number of occurrences of each pair, we’d be able to find the probability of each pair. In this case, the most likely pair would be the most frequently occurring one, which basically solves the original task to find the most frequently occurring pairs. However, this wouldn’t give a clear minimum number to “most frequent.” In our case, we want to be able to say, “which k-element pairs occur at least $K$ times with high probability?”

At this point, we have two important problems. As we saw in the previous post, it quickly becomes infeasible to enumerate all possible k-element pairs in a larger collection, because the number of pairs skyrockets as the number of distinct elements increases (the n choose k problem). This is a sampling problem, and it is outside of our scope here. However, we can imagine (as an exercise for another blog post, perhaps) that we’ve sampled a subset of our collection to choose relevant k-element pairs and assign each pair a probability. Second, we haven’t figured out the probability that an event (a k-element pair) will occur exactly $K$ times in a collection of individual elements. We will solve the second problem and use our new knowledge to find pairs that occur more than $K$ times.

So moving on, let’s figure out the probability that an event, or a particular k-element pair, will occur exactly once ($K = 1$) and only on the first trial. This means that the event will happen the first time and it won’t happen ever again. (This also implies that we know how many events there are, but we’ll go over that later.) If an “event” is defined as the presence of a k-element pair, then the probability that it occurs once is the probability of having the pair, $p(\text{pair})$, times the probability of not having the pair, $p(\text{not pair})$, for all remaining events, $p(\text{not pair})^{\text{num events} - 1}$. Combining this into one mathematical statement: $$p(\text{pair}) \cdot p(\text{not pair})^{\text{num events} - 1}$$

What if we wanted the event to occur on only the first and second trials? $$p(\text{pair})^2 \cdot p(\text{not pair})^{\text{num events} - 2}$$

We can abstract the pattern and note that $p(\text{not pair}) = 1 - p(\text{pair})$: $$p(\text{pair})^K \cdot (1 - p(\text{pair}))^{N - K}$$

What we have just created is an equation for the probability that a particular k-element pair will occur on the first trial and never again. This means that if we had a collection of pairs, we could determine the (very small) probability that we’ll see a particular pair once in our observations, and that it will be precisely the first pair we observed. This is pretty cool but entirely useless to us as is. What we really want is the probability that a pair occurs once anywhere in our observations, whether first or last, and this is deceptively simple:

Imagine that we could enumerate all k-element pairs in a collection of $n$ elements, so the size, $N$, of that enumerated collection is $N = {n \choose k}$. The first time around, we look at $N$ k-element pairs and the first pair is $X$. Then, we look at the same $N$ k-element pairs again, and $X$ is the second pair. Imagine we do this $N$ times and see $X$ appear exactly once in each of the $N$ possible positions. In this case, the probability that $X$ occurs exactly once (ie $K = 1$) anywhere in the set is: $$p(\text{1st trial}) \text{ OR } p(\text{2nd trial}) \text{ OR } \ldots \text{ OR } p(\text{Nth trial}) = N \cdot p(\text{pair}) \cdot (1 - p(\text{pair}))^{N - 1}$$

Now we’re onto something! If we were to expand this out to the pair happening exactly twice ($K=2$), we’d suddenly have a much larger, but very similar, problem. In fact, we’d have $N$ choose $K=2$ possible scenarios: $$p(\text{1st and 2nd}) \text{ OR } p(\text{1st and 3rd}) \text{ OR } p(\text{2nd and 3rd}) \text{ OR } \ldots$$

The count of terms separated by $\text{ OR }$ in the above expression is equal to the number of $K$-sized groupings (1st and 2nd, 1st and 3rd, 2nd and 3rd, etc) that exist in the set of all $N$ pairs, or specifically, $N \choose K$. (Remember: $N = {n \choose k}$, so we can also write ${{n\choose k}\choose K}$.) Knowing this, we can simply say that: $$p(K) = {N \choose K} \cdot p(\text{pair})^K \cdot (1 - p(\text{pair}))^{N - K}$$

Great, we’ve just recreated the probability mass function for a binomial distribution! If we set, say, K=10 and plug in a $p(\text{pair})$ value, we will know the likelihood that a pair will occur exactly 10 times. If we repeat with K=11, we’ll see the probability either increase or decrease. If the pmf value increases when K does, then we know that the most likely frequency for the pair is greater than 10, and otherwise, we’re most likely to see the pair 10 or fewer times. We can perform this comparison for the pairs we have a hunch are most frequent (say we sampled a random subset of the input data set and assigned a probability to the relevant found pairs) and filter from those the ones whose pmf value is increasing between K=10 and K=11. In other words, we want only the k-element pairs that are most likely to appear more than 10 times. And there you have it!
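As a sanity check, here is that comparison in a few lines of Python (the p_pair value is made up for illustration):

    from math import comb

    def binomial_pmf(K, N, p):
        """Probability of exactly K occurrences in N observations."""
        return comb(N, K) * p**K * (1 - p)**(N - K)

    p_pair = 0.002   # made-up probability for one particular k-element pair
    N = 10_000       # number of observations

    # If the pmf increases from K=10 to K=11, the pair's most likely
    # frequency is above 10, so keep it as a candidate frequent pair.
    if binomial_pmf(11, N, p_pair) > binomial_pmf(10, N, p_pair):
        print("this pair most likely occurs more than 10 times")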

Frequently Occurring Pairs and the N choose k problem

This post explores what ${N\choose k}$ means.

I was tasked a few days ago with identifying the pairs of elements that occur in at least X lists of data. So how could we go about doing this?

Perhaps the most intuitive implementation (and a computationally problematic one) would be to find all possible pairs of elements and keep a running tally of how many times each pair has occurred. But for large data, this could be a problem for the following reason:

Supposing each list in the data contained 100 elements, there would be ${100\choose 2}$ different possible combinations. ${100\choose 2}$ is a spiffy way of asking, “How many 2-element pairs can I make with these 100 elements,” or perhaps more directly, “How many 2-element combinations can I make with 100 elements?” In math terms, you say there are “N choose k,” or “100 choose 2” combinations.

Thinking about this, you may have realized that for a large number of elements, the number of pairs gets large quickly. Let’s take a look at what ${N\choose k}$ means mathematically (…I just got LaTeX working with markdown, so bear with me!): $${N \choose k} = \frac{N!}{k!\,(N-k)!}$$

Let’s break down what this means. Taking a permutation is like grouping the elements in different orderings. So an array like $[a,b,c,d]$ would have the 3-element permutations below (I chose 3 because it’s larger than 2). Notice that if we only look at the first 2 elements of each, each 2-element permutation occurs multiple times. In other words, if we group the (k+i)-element permutations into k-element permutations, we have lots of duplicates!

$$
\begin{array}{cc}
abc & abd \\
acb & acd \\
adb & adc \\
bac & bad \\
bca & bcd \\
bda & bdc \\
\ldots
\end{array}
\longrightarrow
\begin{array}{c}
ab \\
ac \\
ad \\
ba \\
bc \\
bd \\
\ldots
\end{array}
$$

This idea of grouping elements is cool, because now we can see that 2-element permutations are a subset of 3-element permutations. It also happens that the $(n-k)!$ term in the denominator of the “n choose k” formula above deduplicates these permutations. Let’s see how. First, look at the total possible number of permutations (ie k=n, or “n-element permutations”). Second, find the number of 2-element permutations. Last, we can generalize a formula for finding the number of k-element permutations.

  1. Finding the number of permutations for an array of size n=4: Conceptually, I like to think of this as recursively counting all the groups and groups of groups (etc) of elements. What I mean by this is that if we have our 4-element array, there would be a maximum of 4 1-element structures ($[a], [b], [c], [d]$), each of which expands to 3 2-element structures (ie for $a$: $[ab, ac, ad]$), each of which expands to 2 3-element structures (ie for $ab$: $[abc, abd]$), etc. Applying this concept mathematically, the initial array has 4 (1-letter) groups, each of which expands to 3 2-letter groups, so the total number of 2-letter groups is $4 * 3 = 12$. Each of those further expands into 2 3-letter subgroups, and each 3-letter group into 1 4-letter group, giving us $4 * 3 * 2 * 1$. Therefore, the number of permutations for an array of size n is $n!$.

  2. By the logic above, finding the number of 2-element permutations in an array of size n=4 is super simple! Each of the 4 1-element structures expands to 3 2-element structures, giving us a total of $4 * 3 = 12$ two-element structures.

  3. Generalizing steps 1 and 2, we can now find the number of k-element permutations in an array of size n: it is $n! / (n-k)!$, where dividing by $(n-k)!$ removes the expansions we never perform past the k-th element.

There’s one more step - the conversion of permutations to combinations. Looking back at all the 2-element permutations, we see that $ab$ and $ba$ are distinct pairs. Permutations care about order. Combinations, on the other hand, don’t; $ab$ and $ba$ are the same thing. In much the same way that we de-duplicated our permutations, now we must de-duplicate again, and the logic is very similar. Basically, the number of duplicate combinations for each k-element permutation is the number of permutations of an array of length k, so we divide by $k!$ and arrive at ${n \choose k} = \frac{n!}{k!\,(n-k)!}$.
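Here is a quick Python check of this counting logic, using itertools on our 4-element array:

    from itertools import combinations, permutations
    from math import factorial

    elems = "abcd"                # n = 4
    n, k = len(elems), 2

    perms = list(permutations(elems, k))    # k-element permutations
    combos = list(combinations(elems, k))   # k-element combinations

    assert len(perms) == factorial(n) // factorial(n - k)   # 12 == 4!/2!
    assert len(combos) == len(perms) // factorial(k)        # 6 == 12/2!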

At the beginning of this post, I identified the N choose k problem as a reason why certain computations were inefficient. To that point, check out how quickly the number of 2-element combinations increases for different sizes of n!

N     Num combinations
100   4950
200   19900
300   44850
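These values fall straight out of Python’s math.comb (a quick sketch):

    from math import comb

    # n choose 2 grows quadratically: n * (n - 1) / 2.
    for n in (100, 200, 300, 1000, 100_000):
        print(n, comb(n, 2))
    # 100 -> 4950, 200 -> 19900, 300 -> 44850,
    # 1000 -> 499500, 100000 -> 4999950000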

For large n, the number of 2-element pairs we’d end up counting becomes impractically large: so large, in fact, that the tallies eventually won’t fit in, say, 4 or 8 GB of RAM. So how can we count frequently occurring k-element pairs? In the next blog post, we’ll explore answers to this question by learning about the probability mass function of a binomial distribution and doing some probability! See you then.

Standardizing buckets with Poisson

So I did a fun project a little while ago at work using a Poisson regression to standardize a bunch of data buckets, and decided that I should share it in the interest of solidifying my own understanding of the concepts and (hopefully) helping someone who may also find this topic exciting. See Wikipedia: Poisson distribution and Wikipedia: Poisson regression.

The problem: we have 1.7 million ads targeting users of different ages. For example, one age bucket targets 15 to 20 year olds, and another might target 16 to 18 year olds. Each ad also has performance information, such as the number of clicks or impressions it received. The task is to break these non-standard age buckets into standardized age ranges and estimate any given feature accordingly. This is a perfect opportunity for the Poisson distribution! Mathematically, the task is to describe the distribution of a desired bucket by examining all of the buckets from the data set that include the desired bucket’s age range.

ie.  Go from this:  22-26 year olds -> 10 clicks, 23-25 year olds -> 4 clicks, ...
     To this:       22-23 year olds -> 3 clicks, 24-25 year olds -> 5 clicks, ...

More specifically, each relevant bucket - or relevant ad - has a probability mass function (ie fancy words meaning it’s just a histogram) where each discrete age in the bucket’s age range has a specific number of clicks or impressions that can be drawn from the Poisson distribution for that ad. This sampling step is the interesting part mathematically, because we can define it as a joint probability and do some transformations that give us computer-friendly numerical operations. Let me show you:

p(ageY;bucketX) = poisson(ageY|bucketX) = lambda**ageY * e**(-lambda) / ageY!
where lambda = mean of the distribution

and

for all relevant buckets, for all ages:
  desired_bucket_distribution = p(age1;bucket1) * p(age1;bucket2) * 
                                p(age2;bucket1) * p(age2;bucket2) * ... 

We take the log of everything, because there’s no way in hell a computer can handle multiplying thousands of small probabilities. (Log is great because it lets us add terms together, rather than multiply)

log(desired_bucket_distribution) = log(p(ageY;bucketX)) + ... 
                                 = log(lambda**ageY * e**(-lambda) / ageY!) + ...
                                 = ageY * log(lambda) - lambda - log(ageY!) + ...

Also, notice that we’ve isolated the factorial term, log(ageY!), from any of the lambdas, and that’s super helpful, because we can drop log(ageY!) from the equation entirely. The reasoning: we are only trying to find the best value for lambda, and since log(ageY!) has no lambda component, it only shifts the expression by a constant and cannot change where the maximum is.

                                 = ageY * log(lambda) - lambda + ...

Lastly, by summing these log probability mass functions for all relevant ads and their age ranges, we define a new objective (technically a log-likelihood) that describes our new desired bucket. Performing gradient ascent on this log-likelihood will ideally get you the log of the mean of the distribution, and solves the original task of predicting a value for each desired age range.
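Here is a small Python sketch of the lambda-fitting step under the assumptions above. The observations are made up, and a simple grid search stands in for gradient ascent for clarity:

    import numpy as np

    # Made-up click counts at each discrete age, pooled from every bucket
    # whose age range covers the desired bucket.
    counts = np.array([3, 4, 2, 5, 3, 4])

    def log_likelihood(lam, counts):
        # Sum over observations of: count * log(lambda) - lambda.
        # (The log(count!) term is dropped; it has no lambda in it.)
        return np.sum(counts * np.log(lam) - lam)

    lambdas = np.linspace(0.1, 10, 1000)
    best = lambdas[np.argmax([log_likelihood(l, counts) for l in lambdas])]
    print(best, counts.mean())   # the best lambda is the sample mean (3.5)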

Isn’t that cool?!

Monary+Mongo+Pandas = :)

A lot of people such as myself waste time getting mongo data into numpy or pandas data structures.

You could do it using pymongo. The general process would be to initialize the pymongo driver, make a query, wait for pymongo to convert the results into lists of SON (bson) objects (aka dictionaries), parse the data into arrays, and then copy it into some numpy array. But this work has been done for you already, so why do it again?

Thanks to djcbeach, we have a nifty little module that utilizes mongo’s C driver, the bson C library and python’s ctypes module to load data directly into numpy arrays. It’s fast and easy! From there, we can pass this into Wes McKinney’s pandas DataFrame and be very, very happy.

Let’s look into this, shall we?

  1. Assuming you have numpy already and a mongo server, install Monary. Don’t use pip, because the module isn’t even in the cheeseshop.
        $ hg clone ssh://hg@bitbucket.org/djcbeach/monary ./monary
        $ cd ./monary && python setup.py install
  2. Make a db connection
        $ python
        >>> from monary import Monary
        >>> mon = Monary() # connection to localhost
  3. Make a query and receive numpy arrays
        >>> columns = ['field1', 'field2', 'field3']
        >>> numpy_arrays = mon.query('mydb', 
                        'mycollection', 
                        {'field1':'query_stuff'},
                        columns, 
                        ['int32', 'date', 'string:20'])
  4. For ultimate happiness, pass this into a pandas DataFrame (assuming you’ve also installed pandas)
        >>> import numpy
        >>> import pandas
        >>> mat = numpy.matrix(numpy_arrays).transpose()
        >>> df = pandas.DataFrame(mat, columns=columns)

I don’t think I could safely do a benchmark comparison to pymongo and not feel stupid about it, but if you were interested in seeing where this process spends most time, check this out:

I put the above code into a function called get_tu, which populates 5 columns each with 1,200,000 rows of data (non-NaN), and most of the ~2 seconds it took was spent waiting on mongo. (FYI - I’m using mongo 2.1.3-pre, which returns data for this query ~0.5 seconds faster than the current stable version of mongo.)

        In [53]: %prun -s cumulative main.get_tu()


   342 function calls in 2.221 seconds                                                                               

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.013    0.013    2.221    2.221 <string>:1(<module>)
        1    0.000    0.000    2.208    2.208 main.py:29(get_tu)
        1    1.569    1.569    1.588    1.588 monary.py:283(query)
       14    0.616    0.044    0.616    0.044 {numpy.core.multiarray.array}
        1    0.000    0.000    0.616    0.616 defmatrix.py:233(__new__)
        1    0.000    0.000    0.018    0.018 monary.py:231(_make_column_data)
        5    0.014    0.003    0.014    0.003 {numpy.core.multiarray.zeros}
        1    0.000    0.000    0.004    0.004 frame.py:323(__init__)
        5    0.000    0.000    0.004    0.001 numeric.py:1791(ones)
        5    0.004    0.001    0.004    0.001 {method 'fill' of 'numpy.ndarray' objects}

Automatically activate your virtualenv project

Virtualenv and Virtualenvwrapper are fantastic tools. But have you ever wondered how to automatically load the environment when you cd into your project’s directory? You’ve probably crawled the web and found solutions like aliasing “cd” or writing some sort of bash function. But really, it should be easier.

So have you heard about ruby’s rvm? Besides being a ruby version manager, it lets you create directory-specific “.rvmrc” files which execute every time you cd into a directory.

If you don’t have rvm already, install it and learn ruby. Then try this when you decide you like python better (hah!):

    # save this as .rvmrc in your project directory
    proj_name=PROJECT_NAME
    if [ "${VIRTUAL_ENV##*/}" != "$proj_name" ]
    then
      workon "$proj_name"
    fi

Ubuntu 11.10 Nvidia Thinkpad W520

There’s a lot of stuff on the web about how to set up your nvidia card with ubuntu. After a lot of work, I am able to switch between single and dual monitor setups with an nVidia Quadro 1000M video card on my Lenovo Thinkpad W520 running Ubuntu 11.10. The following instructions worked for me.

  • BIOS: set to discrete and disable optimus detection

  • Backup:

        cp /etc/X11/xorg.conf /etc/X11/xorg.conf.original
  • OPTIONAL? I did this, but not sure if any of it is necessary
        sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
        apt-get --purge remove ALL_THINGS_NVIDIA
        apt-get install --reinstall nvidia-173 nvidia-173-updates nvidia-settings
        apt-get install --reinstall  nvidia-current nvidia-current-updates # necessary?
  • Run “sudo nvidia-xconfig” to replace the current xorg.conf file. Alternatively, you could use my config file, available here

  • Edit /etc/X11/xorg.conf and add the following to “Device” section

        BusId "PCI:1:0:0"
        Option         "RegistryDwords" "EnableBrightnessControl=1"
        Option         "NoLogo" "True"
    • Where the BusId value is determined by the output of:
        $ lspci | grep -i nvidia | grep -i VGA | cut -d " " -f 1
  • Add the following to /etc/modprobe.d/blacklist.conf
        blacklist vga16fb
        blacklist nouveau
        blacklist rivafb
        blacklist nvidiafb
        blacklist rivatv
  • Reboot, and if everything works, I highly recommend saving the xorg.conf file somewhere (like in a git repo)

Notes:

  • If you want dual monitors, I HIGHLY recommend disper.
        $ apt-add-repository 'deb http://ppa.launchpad.net/disper-dev/ppa/ubuntu YOUR_UBUNTU_VERSION_HERE main'
        $ apt-get install disper
  • If disper doesn’t work (it will), you can do it the uglier way: run “nvidia-settings” and click the “detect monitors” button, then click “apply.” (Don’t “save” the settings, because you’ll probably overwrite the xorg.conf file that has your customizations in it.)

  • If you get a black screen on reboot and want to try hacking around, don’t forget about these:
    • Hitting <CTRL><ALT><F1> to login to a terminal without X
    • Check out the man pages for xrandr, startx and X (though these may not work properly with the nvidia drivers)
  • If you’re hosed and want to undo everything, change the BIOS back to its original settings (probably Integrated Graphics) and replace (or remove) the xorg.conf file
        $ cp /etc/X11/xorg.conf.original /etc/X11/xorg.conf
        ## Don't forget to unblacklist the stuff in blacklist.conf! ##
        $ sudo reboot
        ## Did you change the BIOS?

Excel to CSV easy solution

Converting an excel doc to a csv format that Mac OSX terminal programs can read can be a pain in the ass every once in a while. For those of you who have this issue, I have found the solution!

The Problem:

Line parsing. Mac apps (like Excel, TextMate, etc.) read and write line endings as carriage returns. Terminal/unix programs read and write line endings as line feeds.

Line feed is represented as: \n, 10 (ascii ordinal value), 0xa (hex)

Carriage return is represented as: \r, 13 (ascii ordinal value), 0xd (hex)

The Solutions:

Luckily, this isn’t that complicated. Basically, you need to convert \r to \n, and there are several ways to do this.

  1. Change the shell encoding to one that can read the carriage returns, then find and replace them
    $ LANG=en_US.ISO8859-1 tr '\r' '\n' < INPUT.csv > OUTPUT.csv
  2. dos2unix - a terminal package with a set of tools that basically automate the above code. Note that plain dos2unix converts CRLF endings; for the CR-only endings that Mac apps write, its companion tool mac2unix is what you want.

    Assuming you have dos2unix installed:

    $ mac2unix INPUT.csv  # converts the file in place
  3. TextMate - If you prefer a GUI, you can open the csv in TextMate and “Save As” csv with “LF” line endings
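
And if you’d rather script the conversion yourself, here’s a minimal Python sketch of the same \r-to-\n translation (the file names are placeholders):

    data = open('INPUT.csv', 'rb').read()
    data = data.replace('\r\n', '\n').replace('\r', '\n')
    open('OUTPUT.csv', 'wb').write(data)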

Dynamic Inheritance In Python

Dynamic Inheritance is useful when you need to instantiate an object whose base class is only known at the time of instantiation.

For instance, I have a few different container types:

    a = [1,2,3]
    print type(a)

    b = (4,5,6)
    print type(b)

    c = {7:1, 8:1, 9:1}
    print type(c)

And I want to instantiate an object that inherits the same class as one of the above input sequences. Unfortunately, I can’t know which sequence I will inherit from until the program is about to instantiate my object.

Rather than answering the question outright, this post will explain how dynamic inheritance works by example, but will leave the particular implementation of this problem to the reader. So… how can one dynamically inherit a class at time of instantiation?

One option is to define a factory function and a nested class definition to dynamically build and return a class:

    def make(cls):
        """Return a NewClass that inherits from given class object (not an instance of a class)"""

        class NewClass(cls): pass
        return NewClass

    NewClass = make(list)
    print NewClass.__bases__ # the tuple of base classes NewClass inherits from


    newclass_instance = NewClass([1,2,3])

Taking this a step further, you can return an instance that dynamically inherits a class type:

    def makeinst(cls, *args, **kwargs):
        """Return an instance of NewClass that inherits from given cls, initialized with given params"""
        class NewClass(cls): pass
        return NewClass(*args, **kwargs)

    newclass_instance = makeinst(list, (4,5,6))

Which leads us to implementing a MixIn. (This means that the new class will have all of cls’s methods plus the mixin’s methods at its disposal. Super useful!)

    def makeWithMixin(cls, mixin):
        class NewClass(cls, mixin): pass
        # rename the class to reflect its parents
        NewClass.__name__ = "%s_%s" % (cls.__name__, mixin.__name__)
        return NewClass

    class Car(object): pass
    truck = makeWithMixin(Car, list)
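
A quick check that the mixin’s methods are really available (a hypothetical usage of the truck class defined above):

    t = truck()          # truck is the dynamically built class Car_list
    t.extend([1, 2, 3])  # list methods come from the mixin
    print type(t).__name__, t  # prints: Car_list [1, 2, 3]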

Here is how you would add multiple mixins at once. (Note that the following example doesn’t work when the underlying C structure of each class type is different.)

    def makeWithMixins(cls, mixins):
        for mixin in mixins:
            if mixin not in cls.__bases__:
                cls.__bases__ = (mixin,) + cls.__bases__
            else:
                print "Cannot add %s to %s" % (mixin, cls)
        return cls

    # workaround for http://bugs.python.org/issue672115

    class object(object): pass

    class Car(object): pass
    class Submarine(object): a = 1
    class Plane(object): b = 1
    mixedlist = makeWithMixins(Car, (Submarine, Plane))

Or you can pass in an instance and let NewClass inherit whatever type that instance is.

    def makeinst2(instance):
        class NewClass(instance.__class__): pass
        # assumes NewClass can be constructed from the instance itself,
        # e.g. list(some_list) builds a copy of some_list
        return NewClass(instance)

As you can see, there are several variations of saying more or less the same thing: the above code mashes classes together using inheritance. But a nested class inside a function isn’t the only approach. Python’s type() builtin can also make class objects dynamically:

    name = 'NewClass'
    bases = (list, )
    namespace = {}  # class attributes and methods would go here
    NewClass = type(name, bases, namespace)

    newclass_instance = NewClass([1,2,3])

For further details, you might also want to check out the __new__ method and Python metaprogramming.

IPython Shell Integration

This post shows you how to take IPython’s sweet bash/shell integration to another level. Of particular interest is my last example, which utilizes IPython’s syntax and bash integration to create a scripting language that supports pure Python + IPython features + bash (as supported by IPython).

First off, some distinctions: IPython has aliases, macros, and magics. They are different things.
  • Aliases are shortcuts to bash commands that get piped out to the os in a subprocess. Each alias is a tuple, e.g. (“showmydirectory”, “ls”).
  • Macros are shortcuts to python code stored as a string. They are editable via the ‘%edit’ magic command.
  • Magics are functions special to IPython that take parameters and utilize IPython internals to do helpful things. They can be identified by the ‘%’ at the beginning.
On to some code:
  1. Import everything within your path that is executable as an IPython alias. This means you can use bash commands like ‘git status’ or ‘tail -f somefilename | grep something’ without explicitly using the ‘!’ prefix to define a bash command.
        In [1]: %rehashx # rehashx is a magic function
  2. Define an IPython alias within the IPython Interpreter.
        In [2]: touch myfile
    
        In [3]: a = get_ipython()
    
        In [4]: a.alias_manager.define_alias('foo', 'ls %s')
    
        In [5]: foo myfile
        myfile
  3. Profile Customization: automatically port bash aliases and run the magic, %rehashx, on startup.

    Add the following code to your config file

        # your config file is probably here:

        $ ls ~/.config/ipython/profile_default/ipython_config.py
Note: you can also just run this interactively in the IPython interpreter.
        # Add all of the following code to your ipython config file


        c = get_config()

        # Port bash aliases to ipython

        import os
        a = os.popen("bash -l -c 'alias'").read()
        a = a.replace('=', ' ').replace('"', '').replace("'", '').split('alias ')
        a =  [tuple(x.strip().split(' ', 1)) for x in a]
        c.AliasManager.user_aliases = [x for x in a if len(x) == 2]
  4. Use IPython as a system shell, and integrate bash code with python code!
        In [6]: !echo 'some data\n0 1 2 3\n4 5 6\n7 8 9' > myfile

        In [7]: !!tail myfile
        Out[7]: ['some data', '0 1 2 3', '4 5 6', '7 8 9']

        In [8]: _ #access to your output history. for details, type hist? at the ipython prompt

        Out[8]: ['some data', '0 1 2 3', '4 5 6', '7 8 9']

        In [9]: a = !ls -l # store bash output as a python variable


        In [10]: b ='from ipython' ; c = ' and me'

        In [11]: !echo 'hello $b' # use python variables in bash cmds

        hello from ipython

        In [12]: !echo 'hello ${b + c}' # embed python code into the bash code
        hello from ipython and me
Using SList, you can do all sorts of transformations on output from bash commands.
        In [14]: touch myfile # if you get a SyntaxError, perhaps you didn't use %rehashx?

        In [15]: mkdir mydir

        In [16]: a = !ls -dl myfile mydir

        In [17]: print type(a)
        <class 'IPython.utils.text.SList'>

        In [18]: a.fields()
        Out[18]:
        [['drwxr-xr-x', '2', 'agaudio', 'staff', '68', 'Oct', '2', '15:38', 'mydir'],
         ['-rw-r--r--', '1', 'agaudio', 'staff', '30', 'Oct', '2', '15:35', 'myfile']]



        In [19]: a.fields(-1)
        Out[19]: ['mydir', 'myfile']

        In [20]: a.grep(r'^d') # regular expression grep
        Out[20]: ['drwxr-xr-x  2 agaudio  staff  68 Oct  2 15:38 mydir']

        # return list of filepaths for all directories returned by ls -l
        In [21]: a.grep(r'^d').fields(-1).p
        Out[21]: [path(u'mydir')]
  5. A whole new replacement for python (and bash) scripting! Supports pure python + IPython + bash (as it works in IPython).

    Use this program, ipyscript.py, to execute scripts. (A hypothetical example script is sketched after the commands below.)

    Execute the script:

         $ chmod +x ipyscript.py
     $ ./ipyscript.py example.ipy
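
    A tiny hypothetical example.ipy, mixing pure python with IPython’s bash integration (a sketch of the idea, not the original example script):

        # example.ipy (hypothetical)
        greeting = 'hello'
        !echo $greeting world        # bash, using a python variable
        files = !ls                  # capture bash output as an SList
        print files.fields(-1)
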
    • Note: there is also a method in IPython core that (I believe) attempts to achieve a similar effect:
            In [1]: IPython.core.interactiveshell.InteractiveShell.safe_execfile_ipy??
        
            # This can be accessed like this:

            In [2]: c = get_ipython()
    
            In [3]: c.safe_execfile_ipy('./example.ipy')

Unfortunately, safe_execfile_ipy doesn’t work very well.

For more tips

Check out the notes for my talk on tmux and ipython:

        git clone https://github.com/adgaudio/Tmux-IPython-Talk.git

Or the video of my talk: HERE (python in the second half)

Solving the Vim and Command-t segmentation fault problem

Does this apply to you? Read on to resolve your issue.

Problem: A seg fault occurs when using the command-t plugin for vim

    # vim works until you try to use the command-t plugin:
    $ vim
    Vim: Caught deadly signal SEGV
    Vim: Finished.
    Segmentation fault  

Reason: Command-t and vim were compiled against different versions of ruby

Resolution: First, let’s find out if you really do have vim and command-t installed.

    # Check: Is vim compiled with ruby support?
    $ vim --version |grep ruby
    +quickfix +reltime +rightleft +ruby +scrollbind +signs +smartindent -sniff
    
    # Check: Do you have command-t installed?
    ## My plugin is installed here (thanks to the pathogen plugin)
    ## your plugin may be installed in some place like ~/.vim/plugin
    $ ls -d ~/.vim/bundle/command-t
    /Users/agaudio/.vim/bundle/command-t # on my mac

    # Check: Does vim work without command-t?
    #### NOTE: your plugin path may be different
    $ mv ~/.vim/bundle/command-t ~ && vim && mv ~/command-t ~/.vim/bundle/  

    # Check: You are getting that seg fault with command-t installed, right?
    $ vim

Assuming you’ve passed those checks, let’s find out whether vim and command-t use different ruby libraries.

My Vim install uses this ruby library:

    $ otool -L `which vim`|grep ruby
        /opt/local/lib/libruby.dylib (compatibility version 1.8.0, current version 1.8.7)

Get the Command-t ruby library:

    $ cd ~/.vim/bundle/command-t #### NOTE: your plugin path may be different
    $ grep ^prefix `find . -name "Makefile"`
    prefix = $(DESTDIR)/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr

    ###NOTE: If you don't have a Makefile, you may need to do this first:
    $ rake make ; make 
    $ vim # maybe you've just fixed your problem?

At this point, we can see that vim was compiled against a different ruby library than command-t. Moving forward, we have two options: compile command-t to match vim, or compile vim to match command-t. The choice may depend on whether you want to use the system ruby or, in my case, the macports installation of ruby.

Below, I show how to compile command-t against the ruby lib that vim was compiled with. If you want to compile vim from source instead, look elsewhere.

    # These are the interpreters available to me.  This will be different for you.
    $ which -a ruby
    /opt/local/bin/ruby
    /usr/bin/ruby
    /opt/local/bin/ruby

    # Find which of the above ruby interpreters matches vim's ruby library. 
    # (we used "otool -L vim" to find vim's ruby library, remember?)
    # In my case, the /opt/local/bin/ruby interpreter matches the ruby lib I'm interested in
    $ /opt/local/bin/ruby -e 'puts $:'
    /opt/local/lib/ruby/site_ruby/1.8
    /opt/local/lib/ruby/site_ruby/1.8/i686-darwin10
    /opt/local/lib/ruby/site_ruby
    /opt/local/lib/ruby/vendor_ruby/1.8
    /opt/local/lib/ruby/vendor_ruby/1.8/i686-darwin10
    /opt/local/lib/ruby/vendor_ruby
    /opt/local/lib/ruby/1.8
    /opt/local/lib/ruby/1.8/i686-darwin10
    .

Now, compile command-t with the interpreter that matches vim’s ruby lib:

    $ cd ~/.vim/bundle/command-t 
    # cd into dir containing extconf.rb (if necessary):
    $ cd `dirname \`find . -name "extconf.rb"\`` 

    $ /opt/local/bin/ruby extconf.rb 
    $ make

Test that vim works:

    $ vim
:)

Sharing Part of Your Repo on GitHub

I haven’t found a lot of discussion about how to share only a part of a git repo with GitHub. Here is a quick tutorial.

Let’s say you have a private repository that contains development history and files that you may need to keep private, such as proprietary code, firewall configurations, passwords or private keys. However, the repo also has a lot of files that you’d like to share on GitHub. This post explains how to create a branch containing files and history solely meant for sharing on GitHub. We will also implement a git hook and review ssh options to make pushing to GitHub easier and more secure.

Update In general, your git repos should either be entirely public or entirely private. If you find yourself wondering whether it’s possible to publicize some part of a private repo, you should probably first consider how you can split the repo into two separate ones. For instance, if you wish to open source an application while keeping your personal application configuration private, you should have two separate repositories with app configuration isolated to the private repo and application code isolated to the public repo. End of update

Assume that we have a private repo where we do all our work. The repo contains public, private, and mixed directories. Something like:

git init private_repo

cd private_repo
mkdir private public mixed
touch private/secret_pw private/proprietary private/personal 
touch public/cool_project public/useful_code public/docs 
touch mixed/a_private_file mixed/a_public_file

git add private public mixed
git commit -am "initial commit to master"

Now we have a repo in our working environment and we can consider what code we will want to display on GitHub. Specifically, we want to share our ‘public’ directory and a file in the ‘mixed’ directory on GitHub and hide the private files. In this step, we create an orphaned branch containing only the files we want to share on GitHub.

# Create a branch that will contain only our public files
git checkout --orphan github 
#     read docs (git help checkout) for the specifics about '--orphan'.  
#     It resets your development history.

# Choose either option 1 or 2 (whatever's easiest)
# OPTION 1: Starting with an empty slate
git rm -rf ./*  # only necessary the first time
git checkout master public mixed/a_public_file
# OPTION 2: Filtering out private files
git rm -rf private mixed/a_private_file

git status
git commit -am "files for public viewing on github"

At this point, we have a master branch and a github branch that exist in the private repo. The github branch contains a subset of files found in the master. Next, we create a GitHub repository on the GitHub website (Dashboard –> New Repository –> Fill out form), and then we register this repo as a remote in our private repo.

# 1. Create a public repository on GitHub (go to their website)
# 2. add a remote:
git remote add public_repo git@github.com:adgaudio/myRepoOnGitHub.git

If, at this point, you don’t want to create a git repo but would prefer to test locally, you can try this instead:

# 1. Emulate creating a public repo on GitHub:
cd ..
git init --bare fake_public_repo
# 2. add a remote:
cd private_repo
git remote add public_repo ../fake_public_repo

Ordinarily, to push our branch to GitHub, all we’d need to do is this:

# don't do this yet
cd ../private_repo
git push public_repo github:master  # ie. push github to public_repo/master

And we’re done! However, being lazy (I mean efficient!) programmers, perhaps we’d like to automate pushing the github branch to GitHub whenever we make a commit. The trick here is to use a hook that gets called after every commit.

cd ../private_repo/.git/hooks
emacs post-commit

#Add this data to the file:
    #!/bin/bash
    LOG="/tmp/github.log"
    echo "`date`" >> $LOG

    #Push changes to GitHub
     #note - GitHub needs to know your public key
    (git push public_repo github:master 2>&1) >> $LOG 

chmod +x ./post-commit

What this means is that whenever we make a commit (the hook runs after every commit), the github branch automatically gets pushed to GitHub!

If you’ve followed along so far and don’t have an ssh key configured with GitHub, you may not be able to push your code to your new repo! Let’s go over how to create an ssh key specifically for your GitHub account.

Here’s what I’d suggest you do:

# 1: Create a key specifically for github:
ssh-keygen -f ~/.ssh/id_rsa_github  # Be Careful not to overwrite an existing private key!

# 2: Configure ssh access to github.com to always use your new key
emacs ~/.ssh/config
#Add this to the file
    Host github.com
    User git
    Port 22
    Hostname github.com
    IdentityFile ~/.ssh/id_rsa_github
    TCPKeepAlive yes
    IdentitiesOnly yes

# 3: Navigate your browser to your GitHub profile settings page and
#    register the public key you just created (`id_rsa_github.pub`) as an
#    authorized key.

# 4: Add your new key to your keyring
# (you may need to do this after every reboot before using the key)
ssh-add ~/.ssh/id_rsa_github

Finally, test your setup by pushing your code

cd ../private_repo
git push public_repo github:master

Derivatively Deceptive!

A conceptual explanation of derivatives, partial derivatives, directional derivatives, and differential equations

A derivative can be thought of as a result of comparing two quantities that change with respect to one another.

Imagine a graph with an arbitrarily chosen domain, x, and range, y, and imagine that we plot some function, f, in this plane. This function can only tell you the resulting y for any given x; it cannot explicitly say whether the next y value is ‘increasing’ or ‘decreasing.’

Enter the derivative: The derivative is a value (returned by the derivative operator) that describes how y changes as you progress through every value of x. More specifically, the derivative determines how much y is changing (increasing or decreasing) relative to a constantly changing x. Because the relationship between x and y is relative, we can switch our point of view to say the derivative also tells us whether x is changing more or less than the change in y. A derivative operator is a function that returns a derivative, or a single value describing how much more (or less) one variable is changing than the other variable in the given original function.

There are derivatives of derivatives. This is also like saying there are functions of functions. In terms of our original function on the x,y plane, the first derivative can be seen as a value that describes how y changes with respect to x. The 2nd derivative, then, describes how the first derivative changes with respect to x, or how the change in y changes with respect to x. The 3rd and 4th derivatives follow in the same manner. Every derivative is the result of passing an input to the derivative operator.
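
A quick concrete example:

    f(x)    = x^3
    f'(x)   = 3x^2   (1st derivative: how y changes with x)
    f''(x)  = 6x     (2nd derivative: how that change itself changes)
    f'''(x) = 6      (3rd derivative)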

A real world application of this concept is plotting distance traveled (y) over time (x) in the x,y plane. If position is defined by a function g, then velocity is the derivative of g at a given time, acceleration is the derivative of the velocity (the second derivative of g), and the rate of change of acceleration is the next derivative after that.

Partial derivatives are taken for functions with multiple inputs (for instance, f(x,z) = y), where we assume that one of the variables is constant. If we hold z constant for a given x, then we can find the derivative (ie the change in y given x). This is a partial derivative.
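
For example, holding one input constant at a time:

    f(x,z) = x^2 * z
    ∂f/∂x  = 2xz    (treat z as a constant)
    ∂f/∂z  = x^2    (treat x as a constant)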

A directional derivative is a generalization of the partial derivative. It describes the change in y when both x and z are changing. Imagine we are plotting a 3-D manifold (a 3D object that passes the vertical line test) in the x,y,z coordinate system. Now, slice the object on the (x,z) plane. Here, if we have found all the partial derivatives for one given z, then we have calculated all the partial derivatives (change in y for this x) on the x axis. Similarly, if we find all partial derivatives given one x, we have a group of partial derivatives on the z axis. What this means is that we can define a (x,z) plane containing all the groups of partial derivatives on the z axis (given every x) and all the groups of partial derivatives on the x axis (given every z). Taking this a step further, for a given point in the (x,z) plane, we can define a vector that determines how much weight to place on the partial derivative given z and the partial derivative given x. The value of this vector is the directional derivative.

Derivative notation: The 4 main notations are named after their creators: Leibniz, Lagrange, Newton, and Euler. You should Wikipedia ‘Derivative’ for the math and symbols.

Lastly, and probably too briefly, differential equations: A differential equation is an equation that relates a function to its own derivatives (for example, stating that some combination of a function’s derivatives sums to the function itself). So, solving a differential equation is usually a matter of identifying the functions whose derivatives satisfy that relationship. In many cases, there are infinitely many or unknown solutions, but in simple equations only one or two derivative terms appear (the rest are equal to zero).
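
For instance:

    f''(x) + f(x) = 0    (a simple differential equation)
    f(x) = sin(x)        (one solution, since f''(x) = -sin(x))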


If you’re still interested, get your hands dirty with some math!
