My 404s Don't code and drive


Terminal Search For Non-ASCII Characters

Sometimes compilers of all sorts give you trouble over non-ASCII characters, usually without any feedback as to where in the code they are located. You can then search a directory or a file for the pesky characters. I found this discussion on Stack Overflow. The idea is to run grep with a regular expression that matches every character outside the ASCII range. To search, for example, your site directory, run

grep --color='auto' -P -n '[^\x00-\x7F]' _site/*

or, on OS X,

pcregrep --color='auto' -n "[^\x00-\x7F]" _site/* 

Or you can just do a regular expression search for [^\x00-\x7F] in your favorite editor.
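If neither grep -P nor pcregrep is at hand, the same search can be sketched in a few lines of Python. This is only a minimal sketch: the function names find_non_ascii and scan_directory are made up for illustration, the files are assumed to be UTF-8 text, and only the files directly inside the directory are scanned (like _site/* above).

```python
from pathlib import Path


def find_non_ascii(text):
    """Yield (line_number, column, character) for each non-ASCII character."""
    # Split on "\n" only, so unusual line separators such as U+2028
    # stay inside the line and are reported like any other character.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                yield line_no, col, ch


def scan_directory(root):
    """Print every non-ASCII character in the files directly under root."""
    for path in sorted(Path(root).iterdir()):
        if not path.is_file():
            continue
        # errors="replace" turns undecodable bytes into U+FFFD,
        # which is itself non-ASCII and therefore gets reported too.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, col, ch in find_non_ascii(text):
            print(f"{path}:{line_no}:{col}: {ch!r}")
```

Calling scan_directory("_site") prints one file:line:column entry per offending character.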

How To Easily Find The Clipping Parameters For A Figure In LaTeX

You are in LaTeX setting up a figure and you want to show only part of it. Then you can use the clip and trim options of the \includegraphics command, like this,

% l b r t
\includegraphics[width=1.\linewidth,clip,trim=280 225 180 190]{mypic.png}
\caption{Beautiful Picture.}

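where trim=280 225 180 190 gives the amount to cut from each side in the order left, bottom, right, top (l b r t). Values without a unit are read as bp (big points, 1/72 inch). The clip option is needed as well, otherwise the trimmed parts are still drawn outside the reduced bounding box.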

This can be a tricky endeavor, since it takes trial and error to find out how much to remove. When the figure and the document have the same background color it is hard to see where the figure ends. However, if we put a frame around the figure we can better estimate how much space is left to the margin. This can be done with a package that adds a frame key to \includegraphics, for example adjustbox loaded with the export option (\usepackage[export]{adjustbox}).


We can then do,

% l b r t
\includegraphics[width=1.\linewidth,clip,trim=280 225 180 190,frame]{mypic.png}
\caption{Beautiful Picture.}

This gives a black frame around the image, so we can see how much space is left. I have found this extremely useful for PDFs exported from Keynote, since the program has no clipping function.

Using Python's Enumerate With Iterators That Return Multiple Values

Let's say you are using an iterator that returns multiple values, which you pass to a function, like this,

results = [myFun(val1,val2,...) for val1,val2,... in myIterable]

Now, the value the function returns might depend on where you are in the iteration, so we would like to pass an index to the function. In Python we can do this with enumerate. For example, to create a new array by multiplying each value in an array by its position in the array,

result = [val*idx for idx,val in enumerate(myArray)]

To combine this with an iterable that returns multiple values, we have to wrap the unpacked values in parentheses so that they form a tuple,

results = [myFun(val1,val2,...,idx) for idx,(val1,val2,...) in enumerate(myIterable)]

Notice that even though val1, val2, ... are unpacked inside the tuple, we can still refer to them individually as arguments to the function.
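A small runnable version of the pattern, assuming an iterable of (x, y) pairs. The function scale_point and the list points are made up for this illustration.

```python
def scale_point(x, y, idx):
    """Hypothetical function: scale a point by its position in the list."""
    return (x * idx, y * idx)


points = [(1, 2), (3, 4), (5, 6)]

# enumerate yields (idx, (x, y)); the inner parentheses unpack the pair
results = [scale_point(x, y, idx) for idx, (x, y) in enumerate(points)]
print(results)  # [(0, 0), (3, 4), (10, 12)]
```

Without the inner parentheses, for idx, x, y in enumerate(points) raises a ValueError, because each item produced by enumerate has only two elements: the index and the pair.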

Pandas - How to replace string values in a column with integer numbers

I found this nice solution on Stack Overflow to the problem of replacing strings in the columns of a dataframe. Let's assume that a column named labels contains the label strings [A,B,C] and we want to replace them with [0,1,2]. Then we can do the following,

labels = df['labels'].unique().tolist()
mapping = dict( zip(labels,range(len(labels))) )
df.replace({'labels': mapping},inplace=True)

This code first creates a list of the unique labels, then builds a dictionary that maps each label to a number, and finally replaces the label values with the numbers in place. Note that unique() returns the labels in order of first appearance, so A only maps to 0 if it is the first label in the column.
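Here is a self-contained sketch of the same idea on a toy dataframe (the data is made up). It differs from the snippet above in two ways, both my own choices: the labels are sorted so that the numbering does not depend on row order, and the mapping is applied with Series.map instead of DataFrame.replace.

```python
import pandas as pd

# Toy data, made up for this illustration
df = pd.DataFrame({"labels": ["B", "A", "C", "A", "B"]})

# Sort the unique labels so that A -> 0, B -> 1, C -> 2 regardless of row order
labels = sorted(df["labels"].unique().tolist())
mapping = {label: code for code, label in enumerate(labels)}

# Series.map looks up every value of the column in the dictionary
df["labels"] = df["labels"].map(mapping)

print(mapping)                 # {'A': 0, 'B': 1, 'C': 2}
print(df["labels"].tolist())   # [1, 0, 2, 0, 1]
```

One thing to keep in mind with map: a value that is missing from the dictionary becomes NaN, whereas replace would leave it unchanged.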

How to clear the terminal buffer

Sometimes you want to start over in the terminal. On OS X there is a convenient shortcut: cmd+k. This will clear your scroll buffer. In Ubuntu there is no such shortcut. What you can do to clear the screen and reset the scroll-back is to write

clear && printf '\e[3J'

Of course, we can add an alias to our .bashrc so that we only have to type, for example, ccc:

alias ccc="clear && printf '\e[3J'"
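The same escape sequence can be sent from a script. Below is a minimal Python sketch; the name clear_terminal is made up, and it assumes an xterm-compatible terminal, since ESC [ 3 J (erase saved lines) is an xterm extension that not every terminal honors.

```python
import sys

CLEAR_SCREEN = "\033[H\033[2J"  # move the cursor home, erase the visible screen
CLEAR_SCROLLBACK = "\033[3J"    # xterm extension: erase the saved scroll-back lines


def clear_terminal(stream=sys.stdout):
    """Write the clear-screen and clear-scroll-back sequences to stream."""
    stream.write(CLEAR_SCREEN + CLEAR_SCROLLBACK)
    stream.flush()
```

Calling clear_terminal() from a script has roughly the effect of the shell one-liner above: the first two sequences stand in for clear, and the last one is the '\e[3J' part.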