<?xml version="1.0" encoding="UTF-8"?>
<feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:thr="http://purl.org/syndication/thread/1.0"
  xml:lang="en"
   >
  <title type="text">Ian Dennis Miller</title>
  <subtitle type="text">Ian Dennis Miller</subtitle>

  <updated>2012-02-12T18:51:46Z</updated>
  <generator uri="http://blogofile.com/">Blogofile</generator>

  <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog" />
  <id>http://www.iandennismiller.com/blog/feed/atom/</id>
  <link rel="self" type="application/atom+xml" href="http://www.iandennismiller.com/blog/feed/atom/" />
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[A better way to read (at 350 words per minute)]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2012/01/23/a-better-way-to-read" />
    <id>http://www.iandennismiller.com/blog/2012/01/23/a-better-way-to-read</id>
    <updated>2012-01-23T13:56:00Z</updated>
    <published>2012-01-23T13:56:00Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="acceleration" />
    <summary type="html"><![CDATA[A better way to read (at 350 words per minute)]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2012/01/23/a-better-way-to-read"><![CDATA[<p><em>I know kung fu</em></p>
<p>You know that sequence in <em>The Matrix</em> where Neo plugs a cable straight into his brain for the purpose of learning some mad skillz...  <strong>And it only takes a few seconds.  Whoah!</strong>  I love that, but I wouldn't trade my world of illusion for the speed of learning kung fu that way (because of the killer robots, among other reasons).</p>
<p>In so-called "real life," we have to learn things "the hard way," which takes a "little longer..."  but what if I said it was possible to read papers <em>much</em> faster than you're used to?  As in, 3 or 4 times faster?  As in, 350 words per minute?  Sounds pretty far-fetched, but here's where I offer you the blue pill.  Eat it.  Eat it!</p>
<h2>the voice in your head that reads books to you</h2>
<p>I've been interested in the idea of speed reading for some time now, and there have been a variety of interesting solutions over the years.  The take-home message from speed reading books (at least the ones I've read) is that you need to pace yourself, and that you waste time "vocalizing" the words silently inside your head.  Such books claim that if you can learn how to recognize words without speaking them to yourself, then you'll be faster.</p>
<p>Well, maybe so!  I haven't been able to figure out that trick, but it sounds about right.  What if I told you to perform some mental arithmetic while you read a passage in a book?  As in: count backwards from 103, subtracting by 7 each time.  Now see if you read faster or slower than normal...  This sort of <em>cognitive load argument</em> suggests to me that you would, in fact, read slower if you were pronouncing the words versus <em>not</em>.</p>
<p>Or, consider the situation where you are reading a book and you "space out," causing you to read a whole paragraph without actually remembering anything you just read.  What this suggests to me is that the process of your inner voice reciting the words can become so automatic that <em>you are able to do it without investing any attention in the process</em>.  On that basis, I do think I could train myself to read without pronouncing the words - it just needs to become automatic.</p>
<h2>blink and you'll miss it</h2>
<p>Another interesting approach is <a href="http://en.wikipedia.org/wiki/Rapid_serial_visual_presentation">rapid serial visual presentation</a> (RSVP), in which a computer presents individual words on the screen, several hundred per minute.  I've tried this, and it's a great solution for keeping the pace, but it has drawbacks.<br />
</p>
<p>The biggest drawback of RSVP is the significant investment it takes to convert a document to an RSVP-formattable representation.  What do you do with images?  Equations?  Tables?  These don't map onto RSVP in any easy manner, making this a non-starter for academic reading.</p>
<h2>co-reading: the better approach</h2>
<p><strong>co-read</strong> <em>(verb)</em>: to visually scan a document while that document's words are rapidly spoken to you using text-to-speech (TTS) software.</p>
<p>For all I know, my wife actually invented this technique some time around 2005.  I've never heard about it elsewhere, either before or since.  In its simplest version, you take advantage of your operating system's speech facilities, which are used by the visually impaired for screen-reading.  If you don't have trouble seeing your computer display, then maybe this never occurred to you, but the general idea is that even blind people can use computers... they just use software to speak all of the text to them.</p>
<p>I can't really say much about the Microsoft speech facilities, but as far as TTS is concerned, OS X has taken <em>huge</em> strides in the last half-decade.  The new voices that shipped with OS X Lion are just fantastic, and you can get started co-reading almost immediately.  Like, within 60 seconds.</p>
<p>First, set up a hotkey to begin speaking any words you've selected.  Open System Preferences and click on Speech:</p>
<p><img alt="system preferences" src="/assets/blog_images/coreading/preferences.jpeg" /></p>
<p>Then, click "Speak selected text when the key is pressed".  Make this into a key combination you like; I've chosen Option+Control+S.</p>
<p><img alt="speech pane" src="/assets/blog_images/coreading/speech_pane.jpeg" /></p>
<p>Then, open a PDF, select a page at a time, and press the key combo you just chose in order to listen to the words while you follow with your eyes.  Once you're comfortable with that, go back to the Speech preferences and crank up the <em>Speaking Rate</em> of the text-to-speech engine.  This is the horizontal slider in the Speech preferences pane that goes from <em>Slow</em> to <em>Normal</em> to <em>Fast</em>.  Suddenly you're co-reading faster than you thought possible.<br />
</p>
<h2>taking it to the next level</h2>
<p>For starters, it can be cumbersome to select the words on each page as you read through a document, so we'll just render the whole document to mp3 - all at once.  This is advantageous in several ways: </p>
<ul>
<li>an mp3 can be paused and resumed at will</li>
<li>by looking at the length of the mp3, you can see exactly how long it will take to read the document</li>
</ul>
<p>Don't underestimate this second point.  The ability to plan out reading, in pre-determined chunks of time, is a huge advantage.</p>
<p>Next, we'll automate this process, and add on some optimizations that skip over the parts of the documents that interrupt the flow of the text (like page headers and citations).</p>
<h3>extracting all of the text from a PDF</h3>
<p>The general process for rendering a PDF to an audio file begins when we extract all of the words from the .PDF, and save them as a .TXT file.  This is easily accomplished with pdftotext, which can be <a href="http://users.phg-online.de/tk/MOSXS/xpdf-tools-3.dmg">downloaded from this link</a>, or <a href="http://mxcl.github.com/homebrew/">compiled using homebrew</a>. (highly recommended!)  The official <a href="http://www.foolabs.com/xpdf/download.html">Xpdf site is here</a>, which includes Windows binaries.</p>
<p>So let's say you have a .PDF called <em>Important Paper.pdf</em> that you want to co-read.  For the sake of this example, <em>Important Paper.pdf</em> is on your Desktop.  First, open the terminal (in Applications/Utilities), change directories to your Desktop, and extract the text.  To accomplish all that, just type the following commands into the terminal command prompt:</p>
<pre><code>cd ~/Desktop
pdftotext "Important Paper.pdf"
</code></pre>
<p>Anyway, this creates a text file called <em>Important Paper.txt</em>.  The quotation marks are important, because this filename has a space in it.  Otherwise, the computer thinks you're dealing with two files (one called Important and the other called Paper.pdf), because that's what spaces mean on the command line.</p>
<p>Also, this last <em>iconv</em> step might be important, just in case the character encoding on the text file ends up being a little funky.  First, try skipping this step, but if the audio steps give you an error, then come back and try this:</p>
<pre><code>iconv -f ISO-8859-1 -t utf8 "Important Paper.txt" &gt; tmpfile
mv tmpfile "Important Paper.txt"
</code></pre>
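<p>If you're not sure whether the conversion is needed, here is a minimal sketch for checking first.  It assumes your build of pdftotext might already write UTF-8, in which case converting from ISO-8859-1 a second time would mangle accented characters.  The function name <em>is_utf8</em> is hypothetical, made up for this example:</p>
<pre><code># succeeds (exit status 0) when the file is already valid UTF-8
is_utf8() {
    local converted
    # iconv exits with an error when it hits bytes that are not valid UTF-8
    converted=$(iconv -f UTF-8 -t UTF-8 "$1")
}
</code></pre>
<p>If <em>is_utf8 "Important Paper.txt"</em> succeeds, the iconv step can probably be skipped; if it fails, the file is in some other encoding and the conversion above is worth a try.</p>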
<h3>rendering an entire TXT document to audio</h3>
<p>Next, use the OS X speech engine to convert the .txt file into a sound file:</p>
<pre><code>say -v Samantha -r 220 --data-format=alac -o "Important Paper.m4a" -f "Important Paper.txt"
</code></pre>
<p>I like to use the <em>Samantha</em> voice, which is one of the new voices that ships with OS X Lion.  <em>Alex</em> is another good choice.  The important part here is the number 220, which is the number of words per minute to speak.  100 is slow, 200 is medium, 300 is fast, and 400 is right on the brink of what I (personally) can meaningfully interpret.</p>
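<p>Since the rate is given in words per minute, you can get a rough idea of the listening time before rendering anything.  The sketch below simply divides the word count by the rate; the function name <em>estimate_minutes</em> is hypothetical, and the real audio will likely run somewhat longer because the voice pauses at punctuation:</p>
<pre><code># estimate listening time: words in the file divided by words per minute
estimate_minutes() {
    local words
    # wc prints the count followed by the filename; awk keeps only the count
    words=$(wc -w "$1" | awk '{print $1}')
    # integer division, rounded up to the next whole minute
    echo $(( (words + $2 - 1) / $2 ))
}
</code></pre>
<p>Calling <em>estimate_minutes "Important Paper.txt" 220</em> prints the estimated number of minutes at 220 words per minute.</p>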
<p>Finally, convert this file to an .mp3, which is likely to be smaller and might be more portable.<br />
</p>
<pre><code>ffmpeg -i "Important Paper.m4a" "Important Paper.mp3"
</code></pre>
<p>Of course, you can skip this step if you want; iTunes imports m4a files without complaining.  You can compile ffmpeg using homebrew, or look at a point-and-click alternative like <a href="http://audacity.sourceforge.net/">Audacity</a>.<br />
</p>
<p>Now, the time has come to do this thing.  Load the .PDF in one window, load the .MP3 in another window, press play on the .MP3, and <em>learn kung fu!</em></p>
<h3>removing citations</h3>
<p>What!?  Remove citations!?  Yes.  Well, sort of.  See, citations are filled with punctuation, including commas, semicolons, and parentheses.  Usually, text-to-speech software treats these as pauses, and it can really break up the flow.  Also, citations frequently appear in the middle of a sentence, and it's just not conducive to grokking a sentence when it is interrupted.  That's why I wrote this <a href="http://en.wikipedia.org/wiki/Regular_expression">regular expression</a> (regexp), which removes anything inside parentheses containing letters and something that looks like a year:</p>
<pre><code>\([^\)]+?\d{4}?[^\)]*?\)
</code></pre>
<p>At the moment, I choose to do this as a manual step, because each document is a little different.  I recommend loading the .TXT file in a text editor that supports regexp find-and-replace (emacs, TextMate, Sublime Text, others), and just replace everything with nothingness.  If you're bold and reckless, then go ahead...  do it without looking:</p>
<pre><code>perl -pe 's/\([^\)]+?\d{4}?[^\)]*?\)//g' "Important Paper.txt" &gt; tmpfile
mv tmpfile "Important Paper.txt"
</code></pre>
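<p>A less reckless option is a dry run that prints everything the regexp would match, without changing the file.  This is only a sketch using the same pattern; the function name <em>list_citations</em> is hypothetical:</p>
<pre><code># print each parenthesized span the citation regexp would remove, one per line
list_citations() {
    perl -ne 'while (/(\([^\)]+?\d{4}?[^\)]*?\))/g) { print "$1\n" }' "$1"
}
</code></pre>
<p>Run <em>list_citations "Important Paper.txt"</em> and skim the output.  Anything that isn't really a citation (a parenthetical containing a four-digit sample size, for example) would be deleted too, so it's worth a look before committing.</p>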
<p>After you've done this, then go back and render the .TXT file to an mp3.  Now, you can look at the citations (you'll recognize familiar author names visually), but you won't get tripped up when sentences are split in half by references.</p>
<p>Here is another good one: removing hyphenation.</p>
<pre><code>perl -pe 's/-[\s\n]+//g' "Important Paper.txt" &gt; tmpfile
mv tmpfile "Important Paper.txt"
</code></pre>
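<p>The command above removes any hyphen that is followed by whitespace, which also catches a spaced dash in the middle of a line.  A narrower variant, sketched here under the assumption that only words split across a line break should be rejoined, reads the whole file at once and requires a lowercase letter on both sides of the break.  The function name <em>join_hyphenated</em> is hypothetical:</p>
<pre><code># rejoin words that were hyphenated across a line break; prints to standard output
join_hyphenated() {
    # -0777 makes perl read the whole file as one string, so the pattern can span lines
    perl -0777 -pe 's/([a-z])-\n([a-z])/$1$2/g' "$1"
}
</code></pre>
<p>The trade-off is that a genuinely hyphenated compound that happens to break at the end of a line loses its hyphen, but the speech engine rarely cares.</p>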
<p>In fact, you can edit this text file to your heart's content.  Don't want the bibliography?  Just delete it, because you're not going to want to listen to it.  Get rid of page numbers, page headings and footers, and anything else that isn't the actual content of the paper.  If you don't want to hear it, delete it.</p>
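<p>Page numbers are one of the easier things to strip automatically.  As a rough sketch, the following drops the form-feed characters that pdftotext puts between pages, along with any line that contains nothing but digits.  The function name <em>strip_page_numbers</em> is hypothetical, and the digits-only rule is an assumption that will also eat a stray numeric table cell sitting on its own line:</p>
<pre><code># drop page-break characters and lines that hold only a number; prints to standard output
strip_page_numbers() {
    perl -ne 's/\f//g; print unless /^\s*\d+\s*$/' "$1"
}
</code></pre>
<p>Page headers and footers vary too much from one paper to the next for a general rule, so those are still best handled by hand.</p>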
<h3>automating the whole process</h3>
<p>Save the following as <strong>co-read.sh</strong>, or <a href="https://github.com/iandennismiller/co-read">download it from github</a>.</p>
<div class="pygments_murphy"><pre>    <span class="c">#!/bin/bash</span>

    <span class="nv">WPM</span><span class="o">=</span>300
    <span class="nv">INFILE</span><span class="o">=</span><span class="nv">$1</span>
    <span class="nv">BASENAME</span><span class="o">=</span><span class="sb">`</span>basename -s .pdf <span class="s2">&quot;$INFILE&quot;</span><span class="sb">`</span>
    <span class="nv">TXTFILE</span><span class="o">=</span>/tmp/tmp.txt
    <span class="nv">SNDFILE</span><span class="o">=</span>/tmp/tmp.m4a
    <span class="nv">MP3FILE</span><span class="o">=</span><span class="nv">$BASENAME</span>.mp3
    <span class="nv">TMPFILE</span><span class="o">=</span>/tmp/tmpfile

    <span class="nb">echo</span> <span class="s2">&quot;extracting text from $INFILE&quot;</span>

    pdftotext <span class="s2">&quot;$INFILE&quot;</span> <span class="s2">&quot;$TXTFILE&quot;</span>
    iconv -f ISO-8859-1 -t utf8 <span class="s2">&quot;$TXTFILE&quot;</span> &gt; <span class="nv">$TMPFILE</span>

    say -v Samantha -r <span class="nv">$WPM</span> --data-format<span class="o">=</span>alac -o <span class="s2">&quot;$SNDFILE&quot;</span> -f <span class="nv">$TMPFILE</span>
    ffmpeg -i <span class="s2">&quot;$SNDFILE&quot;</span> <span class="s2">&quot;$MP3FILE&quot;</span>

    <span class="nb">echo</span> <span class="s2">&quot;wrote to $MP3FILE&quot;</span>
</pre></div>

<p>Then, make it executable and test it out with your Important Paper:</p>
<pre><code>chmod +x co-read.sh
./co-read.sh "Important Paper.pdf"
</code></pre>
<p>That's it.  This chugs along, producing an audio file (the mp3) that goes at 300 words per minute.  If you want it to go slower, then change $WPM to 250 or something.  You have the mp3 now, so get cracking!</p>
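<p>With a whole folder of papers, the same script can be run in a loop.  This is a sketch only: it assumes co-read.sh sits in the current directory and writes its mp3 there, and the function names <em>mp3_name_for</em> and <em>coread_all</em> are hypothetical:</p>
<pre><code># name of the mp3 that co-read.sh would write for a given PDF
mp3_name_for() {
    echo "$(basename "$1" .pdf).mp3"
}

# render every PDF in a folder, skipping any that already have an mp3
coread_all() {
    local pdf
    for pdf in "$1"/*.pdf; do
        # when the folder has no PDFs, the pattern stays literal; skip it
        if [ ! -e "$pdf" ]; then continue; fi
        if [ -e "$(mp3_name_for "$pdf")" ]; then continue; fi
        ./co-read.sh "$pdf"
    done
}
</code></pre>
<p>Something like <em>coread_all ~/Desktop/papers</em> would then work through the folder one paper at a time.</p>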
<h3>how fast is it?</h3>
<p>How fast does this whole process go?  Fast!  I can responsibly get through 25 dense pages in about 80 minutes.  The rendering process takes less than 10 minutes.  Like I said, since you can look at the length of the MP3 to determine how long the document will take, I can also budget my time better...  and that contributes to further time savings because I only do this when I'm in the right state of mind.  (which is to say: after chugging a pot of green tea)</p>
<h2>a caveat, and an appeal</h2>
<p>This process is written for OS X, but it will work almost as well on Linux using Festival/Festivox TTS.  If someone would adapt the process to Windows and post a comment about it, I'm sure others will appreciate that.</p>
<h2>(an aside: when the PDF doesn't "highlight")</h2>
<p>There's something you need to understand about PDFs.  Much of the time, they are basically just pictures, which have been scanned from sheets of paper, and which are stored in the PDF as a collection of pictures.  This makes as much sense to TTS software as reading a picture of a sunset or a kitten (which is to say it doesn't make any sense at all).<br />
</p>
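<p>A quick way to tell whether a PDF is just pictures is to run pdftotext on it and count the words that come out.  The sketch below checks the resulting .TXT file; the function name <em>looks_scanned</em> is hypothetical, and the threshold of 20 words is an arbitrary assumption:</p>
<pre><code># succeeds when the extracted text is nearly empty, which suggests a scanned PDF
looks_scanned() {
    local words
    words=$(wc -w "$1" | awk '{print $1}')
    [ "$words" -lt 20 ]
}
</code></pre>
<p>If <em>looks_scanned "Important Paper.txt"</em> succeeds, there is probably no text layer for the speech engine to read.</p>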
<p>In order to make this PDF "readable", it must be passed through an Optical Character Recognition (OCR) processor.  If you purchased a scanner, you might have some OCR software lurking on your computer.  Adobe Acrobat X Pro does a pretty good job - I'd go so far as to recommend it.<br />
</p>
<p>OCR software will look at pictures, then try to notice anything in the picture that looks like a letter.  If it finds letters, then it annotates the PDF by putting an invisible, selectable letter on top of the picture of that letter.  Later on, you can use your mouse to highlight these invisible letters, but it will look just about right because the pictures of the real letters are right underneath.<br />
</p>
<p>When you paste the clipboard, it's probably going to contain the text you just highlighted...  This specific detail comes down to the quality of the OCR software used, as well as the quality of the scanned image.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[Scanning a textbook]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2012/01/18/scanning-a-textbook" />
    <id>http://www.iandennismiller.com/blog/2012/01/18/scanning-a-textbook</id>
    <updated>2012-01-18T21:44:00Z</updated>
    <published>2012-01-18T21:44:00Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="diy" />
    <summary type="html"><![CDATA[Scanning a textbook]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2012/01/18/scanning-a-textbook"><![CDATA[<p>Textbooks are a thing of the past.  No, that's not exactly what I mean to say.  More precisely, the physical format of textbooks is a little bulkier than necessary.  I've adopted a new approach, which is to scan my textbooks, and then read them on my tablet.  It's lighter, I can keep all my textbooks with me, and they are fulltext searchable.  I can highlight and annotate the PDF, and it's all good.  The only downside is that some textbook publishers <em>have got it down</em>, and they put out some beautiful, archival-quality works.  The $150 price tag might be steep, but in some cases, the quality is so high that it just might be worth it.  So, as long as you're comfortable with the idea that your entire academic library can be lost in an un-backed-up instant, then you're ready to digitize your textbooks.  Read on.</p>
<h2>the tools for unbinding</h2>
<p>It starts with the tools.  I use a box cutter and a guillotine-style paper cutter to unbind the textbook.</p>
<p><img alt="box cutter and paper cutter" src="/assets/blog_images/unbinding/01_raw_materials.jpeg" /></p>
<p>Then, to bulk-scan everything, I use the amazing Fujitsu ScanSnap.  It's probably not worth the time to do this job with a flat-bed scanner.  Yes, these scanners are $400-$500, but I'll argue that it's worth it, if only for the home-office angle.  Get one, or use the one you might have access to at work.</p>
<p><img alt="fujitsu scansnap" src="/assets/blog_images/unbinding/02_scansnap.jpeg" /></p>
<p>The general idea is to split the textbook into individual pages, then stream all of those pages through a form-feed scanner.  A book is created when separate pages are glued together at the spine, so to un-create said book (returning it to its elemental pages), simply destroy the spine.  I've heard of people using a table saw for the job, but my method is a little more apartment-friendly.  We're simply trying to separate the pages from the glue, and a blade is fine for the job.</p>
<h2>cut the book into segments</h2>
<p>Start by extending the blade about an inch or so.  This isn't a half-hearted thing.  We want to cut straight-through the glue, all the way.</p>
<p><img alt="extend the blade" src="/assets/blog_images/unbinding/03_blade_extended.jpeg" /></p>
<p>Then, grab about 10 or 15 sheets.</p>
<p><img alt="10 or 15 sheets" src="/assets/blog_images/unbinding/04_bite_sized_chunks.jpeg" /></p>
<p>Fold the spine so it's flush against the ground...</p>
<p><img alt="spine against the ground" src="/assets/blog_images/unbinding/05_fold_the_binding_to_ground.jpeg" /></p>
<p>...and cut straight through the glue of the spine.  Take note that we're not cutting through the paper.  That was my first strategy, but it's way too much work.</p>
<p><img alt="cut through binding glue" src="/assets/blog_images/unbinding/06_cut_through_binding_glue.jpeg" /></p>
<p>You will be slicing between the pages, right through the binding.</p>
<p><img alt="slice between pages" src="/assets/blog_images/unbinding/07_cut_along_binding.jpeg" /></p>
<p>After the first cut, you will have a packet of 15 pages, still glued together.  It is like a little pamphlet now.</p>
<p><img alt="after one cut" src="/assets/blog_images/unbinding/08_after_one_cut.jpeg" /></p>
<h2>After the book has been segmented</h2>
<p>Keep cutting through the spine, 10 or 15 pages at a time.  When you're done, you will have lots of pamphlets that are largely intact.  If you're doing it right, you will have created a minimal amount of scraps/waste.</p>
<p><img alt="minimal waste" src="/assets/blog_images/unbinding/09_shreds_from_the_spine.jpeg" /></p>
<p>The reason I suggest doing 10 or 15 pages at a time is that the guillotine-style paper cutter needs to be able to slice through those pages, because this is how you actually remove the glue from the spine.  In the picture below, the spine is hanging over the edge of the paper cutter by about half a centimeter.<br />
</p>
<p><img alt="removing the spine" src="/assets/blog_images/unbinding/10_align_on_guillotine.jpeg" /></p>
<p>It's important to cut off enough of the spine that you remove all of the glue.  If you don't, then some of the pages will still be stuck together, and they won't go through the form-feed scanner properly.  This is the biggest source of jams I've encountered so far.<br />
</p>
<p>On the other hand, you don't want to cut off so much of the page that you lose any of the text.  It also looks nicer when the text isn't flush against the edge of the page.  Once you've found the right amount to cut off, set the guide on your paper cutter (if it has one) to ensure that all of the 15-page packets end up being cut at the same point.</p>
<p><img alt="the handle is a suggestion" src="/assets/blog_images/unbinding/11_apply_pressure_in_the_middle.jpeg" /></p>
<p>Even though my paper cutter has a handle, I've found that this is not always the most effective way to cut.  The handle implies that you should use it to apply all of the force, but this is just a suggestion.  In the picture above, I'm pressing on the middle of the blade, but I would be remiss if I didn't mention the risk of slicing all the fingers off your hand.</p>
<p><img alt="the removed binding" src="/assets/blog_images/unbinding/12_the_removed_binding.jpeg" /></p>
<p>Once that's done, you will have a stack of paper consisting of all the pages of the textbook, and a pile of spine left over.  Now feed the stack through the scanner, using the OCR software that probably came with it.  I assume you've done this part before, so just do what feels right.</p>
<p>Congratulations!  You don't have to lug your textbooks around anymore!  If you want, you can now re-bind your textbook, but that's beyond the scope of this article.  I've done it before, so maybe I'll write about that some time.</p>
<h2>Don't be an asshole</h2>
<p>I'll also say something else: now that you have a PDF of your textbook, don't share it with people who haven't bought the book.  I know it seems like a rip-off when you pay $150 for a textbook, but it's not like academics become millionaires from their textbooks.  Think of these authors as being like small-time artists.  If they're lucky, they sell a few thousand copies per year, and they get some fraction of the sale price.  Just like with the music industry, the publisher takes most of the money.  So let's say some bent-over academic sells 2,000 copies of their book each year for 10 years, and they get $15 for each one.  This $30,000 per year won't put them in a different tax bracket, and you, as a student, are not getting personally ripped off by the person who wrote the book.<br />
</p>
<p>So: don't be a dick.  For the most part, these are good people, and the way things are going, university-level academics tend to live lower-middle-class lives.  The media likes to portray academia one way, but like everything else, this is a distortion.  I will reiterate: don't steal from these people.  If you're rich enough to be in college, then your family may well be richer than your professors, and only an asshole steals from people poorer than themselves.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[The rise of image macros]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2012/01/15/the-rise-of-image-macros" />
    <id>http://www.iandennismiller.com/blog/2012/01/15/the-rise-of-image-macros</id>
    <updated>2012-01-15T11:55:33Z</updated>
    <published>2012-01-15T11:55:33Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="essay" />
    <summary type="html"><![CDATA[The rise of image macros]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2012/01/15/the-rise-of-image-macros"><![CDATA[<p>The term meme (pronounced like gene) was coined by Richard Dawkins (1976) in The Selfish Gene to apply the vocabulary of genetics to questions of culture. Although the term is applicable to any sort of cultural object that can be imitated and mutated, the Internet Meme has risen to particular prominence. An early example of an Internet Meme is the “Eternal September” (Fisher, 1994), in which a Usenet (Daniel, Ellis, &amp; Truscott, 1980) post to alt.folklore.computers demarcated the Internet's transition from a relatively small academic community to the exponentially expanding network of modern times.</p>
<p>With the advent of the World Wide Web (Berners-Lee &amp; Cailliau, 1990) came the introduction and growth of web-based forums, such as the notable online community Something Awful (Kyanka, 1999). The Web was used to propagate one of the first widely-reposted animations, Dancing Baby (Girard et al., 1996), and by the year 2000, the Something Awful community mainstreamed one of the first popularly mutated Internet Memes, All Your Base Are Belong To Us (Dibbell, 2008). </p>
<p>The proliferation of broadband Internet and peer to peer file sharing (e.g. Napster, Kazaa) enabled increasingly sophisticated video sharing, setting the stage for online video phenomena such as Star Wars Kid (Raza, 2002). By 2006, the BBC estimated Star Wars Kid had been viewed over 900 million times (“Star Wars Kid is top viral video,” 2006), which was accomplished through such rapid online retransmission that its trajectory resembled that of a viral pandemic. Capitalizing on this trend, YouTube launched (Chen, Hurley, &amp; Karim, 2005) to become a popular online repository of viral videos.</p>
<p>In 2004, Something Awful community members coined the term image macro, which was named after the mechanism used to insert images into forum posts (“Image Macro,” 2004). Image macros are characterized as a background picture with one or two lines of text captioning overlaid onto the image, usually for ironic or comedic effect (see Figures 1 and 2). In the same year, a Something Awful community member named “moot” founded 4chan (Poole, 2004), which was an image board (modelled after the popular Japanese forum 2ch) that came to be known for its blanket use of the pseudonym Anonymous and as a prolific incubator of memes. </p>
<p>4chan, in turn, helped launch an early class of image macros known as “LOLcats” (Langton, 2007), which are recognizable as pictures of cats with phonetically or grammatically erroneous captions (e.g. “I can has cheezburger”). LOLcat image macros were collected on a popular blog entitled icanhascheezburger.com (Nakagawa &amp; Unebasami, 2007), which became so heavily trafficked that it was sold to investors within the year for $2 million (Grossman, 2008).</p>
<p>In 2009, Time Magazine named 4chan's moot as the year's most influential person, even surpassing politicians, celebrities, and criminals for the title (“The World’s Most Influential Person Is...,” 2009). It was later revealed that the Time poll had been so thoroughly hacked by Anonymous as to manipulate not simply the #1 spot, but also #2-#21, in order to create an acrostic spelling “mARBLECAKE. ALSO, THE GAME” (Schonfeld, 2009), both of which were memes created by 4chan. </p>
<p>As a result of stunts like the Time Magazine hack, the public visibility of memes, and image macros in particular, created demand for simple and user-friendly tools such as quickmeme.com (Wayne, 2010) that enabled novice users with no image-manipulation experience to quickly create image macros. More recently, Cheezburger Inc. raised an additional $30 million from investors to continue the expansion of their commercial image macro/comedy empire (Crunchbase, 2012), while the Canadian magazine Adbusters used a professionally-crafted image macro to launch the Occupy Wall Street movement (Beeston, 2011).</p>
<h2>References</h2>
<p>Beeston, L. (2011, October 11). The Ballerina and the Bull. The Link. Retrieved from <a href="http://thelinknewspaper.ca/article/1951">http://thelinknewspaper.ca/article/1951</a></p>
<p>Berners-Lee, T., &amp; Cailliau, R. (1990). WorldWideWeb: Proposal for a HyperText Project. CERN. Retrieved from <a href="http://www.w3.org/Proposal.html">http://www.w3.org/Proposal.html</a></p>
<p>Chen, S., Hurley, C., &amp; Karim, J. (2005). YouTube. Retrieved from <a href="http://www.youtube.com">http://www.youtube.com</a></p>
<p>Crunchbase. (2012). Cheezburger. Retrieved from <a href="http://www.crunchbase.com/company/pet-holdings-inc">http://www.crunchbase.com/company/pet-holdings-inc</a></p>
<p>Daniel, S., Ellis, J., &amp; Truscott, T. (1980). USENET. Retrieved from <a href="http://ftp.digital.com/pub/news/a/a.news.tar.Z">http://ftp.digital.com/pub/news/a/a.news.tar.Z</a></p>
<p>Dawkins, R. (1976). The selfish gene. New York: Oxford University Press.</p>
<p>Dibbell, J. (2008, January 18). Mutilated Furries, Flying Phalluses: Put the Blame on Griefers, the Sociopaths of the Virtual World. Wired, 16(2). Retrieved from <a href="http://www.wired.com/gaming/virtualworlds/magazine/16-02/mf_goons">http://www.wired.com/gaming/virtualworlds/magazine/16-02/mf_goons</a></p>
<p>Fisher, D. (1994, January 26). Weeks? hah!! alt.folklore.computers. Retrieved from <a href="http://groups.google.com/group/alt.folklore.computers/msg/4bd75d223b992e8d">http://groups.google.com/group/alt.folklore.computers/msg/4bd75d223b992e8d</a></p>
<p>Girard, M., Amkraut, S., Chadwick, J., Bloemink, P., Hutchinson, J., &amp; Felt, A. (1996). Dancing Baby. Character Studio.</p>
<p>Grossman, L. (2008). The Master Of Memes. Time. Retrieved from <a href="http://www.time.com/time/magazine/article/0,9171,1821656,00.html">http://www.time.com/time/magazine/article/0,9171,1821656,00.html</a></p>
<p>Image Macro. (2004, February 12). Something Awful. SAclopedia. Retrieved January 8, 2011, from <a href="http://forums.somethingawful.com/dictionary.php?act=3&amp;topicid=83">http://forums.somethingawful.com/dictionary.php?act=3&amp;topicid=83</a></p>
<p>Kyanka, R. (1999). Something Awful. Retrieved from <a href="http://www.somethingawful.com">http://www.somethingawful.com</a></p>
<p>Langton, J. (2007, September 22). Funny how ‘stupid’ site is addictive. The Toronto Star. Retrieved from <a href="http://www.thestar.com/living/article/257955">http://www.thestar.com/living/article/257955</a></p>
<p>Nakagawa, E., &amp; Unebasami, K. (2007). I Can Has Cheezburger. Retrieved from <a href="http://www.icanhascheezburger.com">http://www.icanhascheezburger.com</a></p>
<p>Poole, C. (2004). 4chan.org. Retrieved from <a href="http://www.4chan.org">http://www.4chan.org</a></p>
<p>Raza, G. (2002). Star Wars Kid.</p>
<p>Schonfeld, E. (2009, April 21). 4Chan Takes Over The Time 100. The Washington Post. Retrieved from <a href="http://www.washingtonpost.com/wp-dyn/content/article/2009/04/21/AR2009042101864.html">http://www.washingtonpost.com/wp-dyn/content/article/2009/04/21/AR2009042101864.html</a></p>
<p>Star Wars Kid is top viral video. (2006, November 27). BBC. Retrieved from <a href="http://news.bbc.co.uk/2/hi/entertainment/6187554.stm">http://news.bbc.co.uk/2/hi/entertainment/6187554.stm</a></p>
<p>The World’s Most Influential Person Is... (2009, April 27). Time. Retrieved from <a href="http://www.time.com/time/arts/article/0,8599,1894028,00.html">http://www.time.com/time/arts/article/0,8599,1894028,00.html</a></p>
<p>Wayne. (2010). quickmeme. quickmeme LLC. Retrieved from <a href="http://www.quickmeme.com">http://www.quickmeme.com</a></p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[Xenadu: a tool for managing system configurations]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2011/05/xenadu-a-tool-for-managing-system-configurations/" />
    <id>http://www.iandennismiller.com/blog/2011/05/xenadu-a-tool-for-managing-system-configurations/</id>
    <updated>2011-05-01T01:26:23Z</updated>
    <published>2011-05-01T01:26:23Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="unix" />
    <category scheme="http://www.iandennismiller.com/blog" term="open source" />
    <category scheme="http://www.iandennismiller.com/blog" term="original" />
    <summary type="html"><![CDATA[Xenadu: a tool for managing system configurations]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2011/05/xenadu-a-tool-for-managing-system-configurations/"><![CDATA[<p>It can be a challenge to administer multiple Linux machines, particularly when this isn't your primary job.  I deal with about 6 machines on a daily basis, and the various configurations tend to blur together.  If I didn't have a system for keeping it straight, I'd be lost.  I know because I still have memories of "the time before the system" and it was madness.</p>
<p>I tried a few existing configuration management tools (<a href="http://www.cfengine.org/">cfengine</a>, <a href="http://www.puppetlabs.com/">puppet</a>, <a href="http://trac.mcs.anl.gov/projects/bcfg2">bcfg2</a>) and everything seemed to be more complex than I wanted, so I wrote my own: <a href="http://github.com/iandennismiller/xenadu">Xenadu</a>. Originally, Xenadu was specialized for creating Xen guest images, but I now use it to administer both virtual and physical machines.  It's one of those "makes the impossible possible" tools, for me, and I'm actually using it on a few production systems.</p>
<p>My primary goal with Xenadu is to track my system administration with a version control program like git.  By doing this, I essentially get automatic documentation and backups of my sysadmin activities.  If I made a change to a system and it broke, I would always be able to get a previous version of the system configuration out of git.</p>
<p>Once you have fully defined a system with Xenadu, it becomes quite easy to create other machines that have similar configurations.  This is great for keeping a stage/production environment in sync, or for creating several instances of a particular production server.</p>
<p>An example <a href="http://github.com/iandennismiller/xenadu">Xenadu</a> configuration for a computer named "davidbowie" might look like this:</p>
<div class="pygments_murphy"><pre><span class="c">#!/usr/bin/env python</span>
<span class="kn">from</span> <span class="nn">Xenadu</span> <span class="kn">import</span> <span class="n">XenaduConfig</span><span class="p">,</span> <span class="n">Perm</span>
<span class="n">mapping</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">[</span><span class="s">&#39;/etc/hosts&#39;</span><span class="p">,</span> <span class="s">&quot;hosts&quot;</span><span class="p">,</span> <span class="n">Perm</span><span class="o">.</span><span class="n">root_644</span><span class="p">],</span>
    <span class="p">[</span><span class="s">&#39;/etc/network/interfaces&#39;</span><span class="p">,</span> <span class="s">&quot;interfaces&quot;</span><span class="p">,</span> <span class="n">Perm</span><span class="o">.</span><span class="n">root_644</span><span class="p">],</span>
    <span class="p">]</span>
<span class="n">env</span> <span class="o">=</span> <span class="p">{</span> <span class="s">&#39;ssh&#39;</span><span class="p">:</span> <span class="p">{</span> <span class="s">&quot;user&quot;</span><span class="p">:</span> <span class="s">&quot;root&quot;</span><span class="p">,</span> <span class="s">&quot;address&quot;</span><span class="p">:</span> <span class="s">&quot;davidbowie.example.com&quot;</span> <span class="p">}</span> <span class="p">}</span>
<span class="n">XenaduConfig</span><span class="p">(</span><span class="n">env</span><span class="p">,</span> <span class="n">mapping</span><span class="p">)</span>
</pre></div>

<p>Save the file as davidbowie.py, and you're ready to manage that computer.  In this case, two files (hosts and interfaces) are going to be tracked by Xenadu.  Now you can edit the files locally, check them into version control, etc.  If you've edited your local copy of hosts and want to send it to your remote system:</p>
<pre><code>./davidbowie.py --push hosts
</code></pre>
<p>A particularly useful feature is that you can specify a file according to its local or remote filename, so the following are equivalent:</p>
<pre><code>./davidbowie.py --push hosts
./davidbowie.py --push /etc/hosts
</code></pre>
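<p>To give a feel for how that kind of lookup can work, here is a minimal sketch in Python.  To be clear, this is not Xenadu's actual code: the <code>resolve</code> helper is hypothetical, and the mapping is a simplified copy of the example above with the permissions left out.</p>
<pre><code># Hypothetical sketch; not Xenadu's actual implementation.
# Each row pairs a remote path with a local filename, as in the example config.
mapping = [
    ["/etc/hosts", "hosts"],
    ["/etc/network/interfaces", "interfaces"],
]

def resolve(name, mapping):
    """Return the (remote, local) pair matching either name, or None."""
    for remote, local in mapping:
        if name in (remote, local):
            return remote, local
    return None

# Both spellings resolve to the same tracked file.
print(resolve("hosts", mapping))
print(resolve("/etc/hosts", mapping))
</code></pre>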
<p>You can also <code>--get</code>, which downloads a file, and <code>--getall</code>, which downloads every file specified in your Xenadu system definition.  Check out the readme file on the <a href="http://github.com/iandennismiller/xenadu">Xenadu</a> website for more information.</p>
<p>In the olden days, I would perform system administration by logging onto the system itself, editing files as I went along.  You can still do this with Xenadu, but make sure to <code>--get</code> your changes from the remote host, so that you can check them into version control.  My current workflow is to make changes locally, then <code>--push</code> them to the remote host.  Give it a try!</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[The Giraffe Gaffe]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2011/03/the-longislandpress-com-giraffe-gaffe/" />
    <id>http://www.iandennismiller.com/blog/2011/03/the-longislandpress-com-giraffe-gaffe/</id>
    <updated>2011-03-31T16:09:56Z</updated>
    <published>2011-03-31T16:09:56Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="journalism" />
    <category scheme="http://www.iandennismiller.com/blog" term="original" />
    <category scheme="http://www.iandennismiller.com/blog" term="internet" />
    <summary type="html"><![CDATA[The Giraffe Gaffe]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2011/03/the-longislandpress-com-giraffe-gaffe/"><![CDATA[<p>As John Vinson <a href="http://www.webpronews.com/directv-petite-lap-giraffe-2011-03">reports on WebProNews</a>: "If you ever need proof that truth is stranger than fiction, simply <a href="http://www.iandennismiller.com/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/">give this story a read</a>." He's referring to a crazy series of events involving a New York-area newspaper and an article I posted to my blog on Tuesday titled, "<a href="http://www.iandennismiller.com/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/">When a newspaper rips off your blog, then taunts you about it</a>". (more coverage <a href="http://www.poynter.org/latest-news/romenesko/125786/blogger-long-island-press-lifted-details-from-my-petite-lap-giraffe-story">here</a>, <a href="http://www.mediabistro.com/fishbowlny/blogger-says-long-island-press-ripped-off-his-work-then-taunted-him-over-it_b31528">here</a>, and <a href="http://poisonyourmind.wordpress.com/2011/03/30/wouldnt-take-skin-off-your-back/">here</a>) There was a pretty vigorous response to the story, to put it mildly!  Since everyone has now had 24 hours to calm down, I'm going to present the complete sequence of events, and close with a brief update on the current state of the situation.</p>
<p>This story really starts at 8:59PM on March 23rd, when I noticed <a href="http://twitter.com/#!/mcuban/status/50723416478724096">this tweet from Mark Cuban</a>:</p>
<blockquote>
<p>Got One !!! Finally..<a href="http://www.petitelapgiraffe.com/">http://www.petitelapgiraffe.com/</a></p>
</blockquote>
<p>Since I had never seen any advertising related to this website, I was completely floored.  If you have no idea what this is about, take a minute to <a href="http://www.petitelapgiraffe.com/">check it out</a>.  Immediately, I had questions: Could these animals really exist?  Are they so rare that only billionaires can afford them?  How does a Russian farm come up with such a polished website?  (that work wasn't cheap) <em>What of the live video feed!? </em>I'll admit it straight-up: I totally bought into the illusion.  It was brilliant!  As I tried to learn more, I grew increasingly (and tragically) skeptical, eventually <a href="/blog/2011/03/petite-lap-giraffes/">composing a blog entry</a> at 9:56PM debunking the advertisements. Then, at 10:01PM on March 23rd, I posted <a href="http://twitter.com/#!/IanDennisMiller/status/50739012872310784">this update to twitter</a>:</p>
<blockquote>
<p>I want a Petite Lap Giraffe! <a href="/blog/2011/03/petite-lap-giraffes/">http://www.iandennismiller.com/blog/2011/03/petite-lap-giraffes/</a> <a href="http://twitter.com/#!/search?q=%23petitelapgiraffe">#petitelapgiraffe</a></p>
</blockquote>
<p>It turns out I wasn't the only person who wanted to know if these animals were real, and thousands of people started visiting my blog to find out the answer.  I had approached the viral ads like a puzzle, so it was a lot of fun for me to debunk the myth.  A lot of people seemed to be genuinely disappointed that the giraffes weren't real, so I spent some additional time consoling the desolate and despondent, who were populating my blog's comments section.</p>
<p>For the next few days, I <a href="https://encrypted.google.com/search?q=petite+lap+giraffe">Googled for "petite lap giraffe"</a> about once per day, and idly clicked through the other coverage of the meme.  I had been following a sequence of articles posted by the LongIslandPress.com (since they seemed to be the only newspaper publishing anything about it), but around 2:00PM on Tuesday March 29th, <a rel="nofollow" href="http://www.longislandpress.com/2011/03/28/petite-lap-giraffes-real-or-directv-marketing-campaign/">one of their articles <em>particularly</em> caught my eye</a>.  It had been published Monday, March 28th at 5:02PM.  Remember that time, because it's going to become really important a little later on.</p>
<p>As with the original Petite Lap Giraffe debunking, I started sensing that <em>something was wrong</em> with this article.  I wish I could link to the original article, but alas, it has since been swallowed up by the memory hole.  Fortunately, I created a PDF archive, so for posterity's sake, this screenshot will serve as a mirror of the original LongIslandPress.com article:</p>
<p><img alt="the original LongIslandPress.com article" src="/assets/blog_images/Screen-shot-2011-03-31-at-2.04.58-PM.png" /></p>
<p>There were three sentences that jumped out at me, particularly the one about the stock image.  In my article debunking the giraffes, that was the detail that had sealed the deal for me.  But had they just claimed they were the ones who performed this research?  It sounded that way to me, so at 2:07PM I posted the following comment in response to the article:</p>
<blockquote>
<p>I’m a little disappointed that you didn’t city [sic] my March 23 blog post on this topic, since it is the original source of the information you mention in this article. I thought it was standard to provide an attribution?</p>
<p><a href="http://www.iandennismiller.com/blog/2011/03/petite-lap-giraffes">http://iandennismiller.com/blog/2011/03/petite-lap-giraffes</a></p>
</blockquote>
<p>No big deal.  I checked back around 2:40PM, and it appeared as if my comment had been deleted.  That kind of irked me, so at 2:44PM, I tried again (this time, a little more forcefully):</p>
<blockquote>
<p>I’m disappointed you didn’t city [sic] my March 23 blog post, where I actually conducted the research you are taking credit for. Specifically, I uncovered the link to the Grey Group, and I also discovered the “hot tub” stock image. At a minimum, you should provide attribution:</p>
<p><a href="http://www.iandennismiller.com/blog/2011/03/petite-lap-giraffes">http://iandennismiller.com/blog/2011/03/petite-lap-giraffes</a></p>
<p>...but it’s dishonest to claim this research as your own. Usually, it’s considered plagiarism.</p>
</blockquote>
<p>Again, it seemed like my comment had been deleted, but I soon realized that it was actually being held for moderation, which is a pretty normal thing in the blog world.  Again, no big deal.  More waiting. When I checked back at 3:05PM, there was a response! However, this was not the response I was expecting.  Someone at LongIslandPress.com had altered the original article, replacing this sentence:</p>
<blockquote>
<p>And the cute little guy in the bath tub? Well, that’s a stock image with the cute little guy added in.</p>
</blockquote>
<p>with this sentence:</p>
<blockquote>
<p>A quick domain name lookup...which is free and public information...will give you those details.</p>
</blockquote>
<p>Here is my record of their article, at that time (3:05PM).</p>
<p><img alt="LongIslandPress.com had altered the original article" src="/assets/blog_images/Screen-shot-2011-03-31-at-2.21.54-PM.png" /></p>
<p>By this point, I was pretty sure someone was trying to cover something up, so I told some of my friends about the situation and began summarizing my findings in <a href="/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/">a new blog article</a>, which I published at 3:39PM.  The next step happened when a friend pinged me, pointing out that the article had been updated yet again.  By 4:23PM, the article included this sentence:</p>
<blockquote>
<p>A quick domain name lookup...which is free and public information...will give you those details, <strong>which we acquired–you know, being a newspaper with research capabilities and all–of our own accord (although some are trying to claim this information as their own “discovery” as a way to promote their own personal website! But enough of that...)</strong></p>
</blockquote>
<p>Here is a screenshot of the article at 4:23PM:</p>
<p><img alt="the article had been updated yet again" src="/assets/blog_images/Screen-shot-2011-03-31-at-2.27.10-PM.png" /></p>
<p>It seemed like they had clearly received my comment, and although they were refusing to publish it, they were certainly responding to it!  Shortly after I was notified of this latest edit, I posted my final comment on their article:</p>
<blockquote>
<p>Whatever - I think you did a pretty lame thing here.  You deleted the detail about the stock photo and are trying to make it sound like you did the rest on your own.  ...but I saved a copy of all 3 versions of your article, and it's pretty clear you know which details you lifted.  I'm kindof amazed at how shameless you are about this (really, all I asked for was proper attribution) but I actually don't have time to pursue this further.</p>
<p>Captcha: You win - lol</p>
</blockquote>
<p>This comment seems to have been deleted outright, rather than being held for moderation,  but I <em>did</em> have a ton of work to do, and I <em>didn't</em> want to deal with this right now.  But hey: this is what friends are for.  They kept asking me questions about the article, and I updated my own article to mention LongIslandPress.com's inflammatory remark.  It seemed to me like LongIslandPress.com had provided a <em>de facto</em> admission of their deeds, so I started asking some forums for advice about how to report a journalistic ethics complaint.  At 8:46PM, I submitted the following blurb to <a href="http://slashdot.org">slashdot.org</a>:</p>
<blockquote>
<p><em>"I've been keeping an eye on this viral marketing campaign called <a href="http://www.petitelapgiraffe.com/">Petite Lap Giraffe</a> — it's the DirecTV ads with the Russian guy and the tiny giraffe. I was <a href="/blog/2011/03/petite-lap-giraffes/">pretty quick to debunk the existence of the giraffes</a>, so a lot of people have been visiting my blog as a result. Today, I noticed a New-York area newspaper that was represented my research as their own, so I asked them to link to my blog (i.e. provide attribution). <a href="/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/">What ended up happening</a> perfectly illustrates that newspapers just don't understand how the Internet works..."</em></p>
</blockquote>
<p>The real break occurred at 11:27PM, when <a href="http://slashdot.org/submission/1513476/newspaper-plagiarizes-blog-taunts-real-author">the story was featured on the front page of slashdot</a>.  This brought on <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Slashdotting">a flood of attention</a>, but it seemed like many people weren't buying my account of events.  I have to admit: I thought these critics had a point. Again, questions crept into my consciousness: What if I got it all wrong?  What if LongIslandPress.com really did conduct their own research?  <em>Freak Out!!</em></p>
<p>I spent the next 3 hours responding to criticisms, but at 2:29AM on March 30, <em>Another Slashdotter</em> posted <a href="/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/#comment-645">the following comment</a>:</p>
<blockquote>
<p>Have you looked through your logs to see if anybody from their domain name/IP address visited your blog right before the article was published?</p>
</blockquote>
<p>Obviously!  The Internet horde definitely needs to hear about the logs!  Earlier in the evening, I had been watching the real-time logs for a project, and I remembered seeing a visitor from the Long Island area.  I had actually done a reverse-DNS lookup at that time, and it turned out to have originated at the hostname <em>mail.longislandpress.com</em>, so this detail was lurking in my memory. In other words, I had a pretty good hunch about what IP address to look for in my personal blog's server access logs.</p>
<p><strong>The logs contained the smoking gun</strong>, and these are the two entries originating from <em>mail.longislandpress.com</em> that sealed the deal:</p>
<pre><code>XXX.XXX.XXX.XX - - [28/Mar/2011:20:56:31 +0000] "GET /favicon.ico HTTP/1.0" 304 - "-" "Mozilla/5.0 
    (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0"

XXX.XXX.XXX.XX - - [29/Mar/2011:19:40:30 +0000] "GET /blog/2011/03/total-bummer-longislandpress-
    com-plagiarism-and-coverup/ HTTP/1.0" 200 13398 "http://www.longislandpress.com/[redacted 
    wordpress admin.php]" "Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0"
</code></pre>
<p>In English, my logs contained records indicating LongIslandPress.com had visited my website at 4:56PM, just 6 minutes before they published their article.  This is why it was so important that the LongIslandPress.com article was published at 5:02PM on March 28th.</p>
<p>Here is the technical interpretation of the first log entry:</p>
<blockquote>
<p>[the favicon.ico] was served with an HTTP 304 code (meaning “unmodified”) which suggests the favicon was already in someone’s cache. That means the page had previously been loaded.  The timestamp is 20:56:31 UTC, meaning it was 4:56PM in New York. The timestamp on the original Long Island Press article is 5:02PM.</p>
</blockquote>
<p>And here is the interpretation of the second entry:</p>
<blockquote>
<p>Someone:</p>
<ul>
<li>using the same IP address as the [favicon.ico] log entry</li>
<li>using the same browser as before (or at least providing the same UserAgent)</li>
<li>using the LIP wordpress admin interface (as indicated by the Referer field)</li>
<li>...clicked through to my site, in order to read [the article about LongIslandPress.com]</li>
</ul>
</blockquote>
<p>This satisfied everybody, and finally I could get some rest.  The next major event occurred around 9:29AM on March 30, when LongIslandPress.com took their article offline.  At 3:45PM, LongIslandPress.com put the original article back online, minus the remarks and with attribution added.</p>
<p>Since then, I've been keeping an eye on LongIslandPress.com, and at 10:27AM on March 31, they posted an article <a rel="nofollow" href="http://www.longislandpress.com/2011/03/31/long-island-bodybuilder-stars-in-directv-commercial/">detailing a local Long Island connection</a> to the Petite Lap Giraffe thing.  So that's why LongIslandPress.com had written so many articles about the Petite Lap Giraffe!  (this really satisfied a nagging question I had been puzzling over for several days).</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[When a newspaper "rips off" your blog, then taunts you about it...]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/" />
    <id>http://www.iandennismiller.com/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/</id>
    <updated>2011-03-29T15:39:29Z</updated>
    <published>2011-03-29T15:39:29Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="journalism" />
    <category scheme="http://www.iandennismiller.com/blog" term="observation" />
    <category scheme="http://www.iandennismiller.com/blog" term="bummer" />
    <summary type="html"><![CDATA[When a newspaper "rips off" your blog, then taunts you about it...]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/"><![CDATA[<p>I've been having a lot of fun with this Petite Lap Giraffe thing, but I came across a totally fascinating situation today.  I discovered a New York-area newspaper that was lifting details <a href="/blog/2011/03/petite-lap-giraffes/">from my Petite Lap Giraffe article</a> without providing a link.  When I called them out on it, <strong>instead of simply linking to my blog, they rewrote the article to cover up the deed!</strong></p>
<p><strong>Update:</strong> <a href="/blog/2011/03/the-longislandpress-com-giraffe-gaffe/">Click here for a complete summary of the events</a>.  (It's probably easier to read.)</p>
<p>So here's the long version of the story...  I was really disappointed to see LongIslandPress.com <a rel="nofollow" href="http://www.longislandpress.com/2011/03/28/petite-lap-giraffes-real-or-directv-marketing-campaign/">ripping off my blog without attribution</a>.</p>
<p>To paraphrase Missy Yates, the author of the article:</p>
<blockquote>
<p>Sorry guys ... petite lap giraffe just doesn’t exist in our realm. ... you’ve SEEN them. We did too. <strong>But let’s do a little research here.</strong></p>
</blockquote>
<p>From there, Yates summarizes the findings <a href="http://www.iandennismiller.com/blog/2011/03/petite-lap-giraffes/">from my blog post</a>, sans useful details, and most importantly without any attribution.  Oh, but they included advertising.</p>
<p>At 2:44PM I commented on the article, saying:</p>
<blockquote>
<p>I'm disappointed you didn't cite my March 23 blog post, where I actually conducted the research you are taking credit for.  Specifically, I uncovered the link to the Grey Group, and I also discovered the "hot tub" stock image.  At a minimum, you should provide attribution.</p>
</blockquote>
<p>This is where it gets interesting. At the time of my comment (2:44PM), the Long Island Press article stated the following:</p>
<blockquote>
<p>The Russian Petite Lap Giraffes featured on www.petitelapgiraffe.com can be traced back to a marketing group: Grey Global Group, which we’re guessing has some kind of connection to DirecTV.</p>
<p><strong>And the cute little guy in the bath tub? Well, that’s a stock image with the cute little guy added in.</strong></p>
</blockquote>
<p>However, by 3:05PM the article was altered to read:</p>
<blockquote>
<p>The Russian Petite Lap Giraffes featured on www.petitelapgiraffe.com can be traced back to a marketing group: Grey Global Group, which we’re guessing has some kind of connection to DirecTV.</p>
<p><strong>A quick domain name lookup…which is free and public information…will give you those details.</strong></p>
</blockquote>
<p>No kidding!  They removed the detail about the stock image, since that's something that originated on my blog and nowhere else.  Regarding the domain name, it <strong>is</strong> free and public information, but hey ... I did it first, and I published it in the same article as the stock image exposé.  The point is that <strong>it's still a "rip off"</strong> to copy someone else's work and take credit for it as if it were your own.  Best of all, <strong>they refused to publish my comment</strong>; it is still being "held for moderation."</p>
<p>As of 4:23PM, the article now includes this language:</p>
<blockquote>
<p>A quick domain name lookup…which is free and public information…will give you those details, <strong>which we acquired–you know, being a newspaper with research capabilities and all–of our own accord (although some are trying to claim this information as their own “discovery” as a way to promote their own personal website! But enough of that…)</strong></p>
</blockquote>
<p>Wow.  Just wow.  They didn't need to be jerks about it, but ... here we are.  So I am left wondering about the following things:</p>
<ul>
<li>Why rip off my work without providing a link?  A link is free, and it's easy to do.</li>
<li>Why alter the article to make it appear as if the "rip off" had not taken place?  I interpret the actions of Long Island Press to constitute the acknowledgment that it is, in fact, a "rip off", and simultaneously that they're totally remorseless about doing it.  Booo!!!</li>
<li>I called out Yates in their comments section, but it's been "held for moderation" for the last hour.  In that time, they've actually changed the article, but they haven't published my comment.  What's going on there!?</li>
</ul>
<p>Like I said, it's just kind of a bummer...  but for a newspaper to go out of their way - to actually alter an article they published in an effort to rewrite history - just so they don't have to respond to a little (valid) criticism?  That's kind of wild!  Let me be clear: this whole thing is so silly that I don't actually want any kind of reparations.  I don't want anyone to be fired or anything like that...  but I also hate it when people get busted and then try to cover it up!  ...and then to be jerks about it - that sucks!</p>
<p><img alt="Screenshot from Long Island Press: my comment is still awaiting moderation..." src="/assets/blog_images/screenshot2.png" /></p>
<p>Here's what <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Plagiarism">wikipedia has to say about plagiarism</a>:</p>
<blockquote>
<p><strong>Plagiarism</strong> is defined in dictionaries as "the wrongful appropriation, close imitation, or purloining and publication, of another author's language, thoughts, ideas, or expressions, and the representation of them as one's own <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Original_work">original work</a>."</p>
</blockquote>
<p><strong>Update</strong>: Slashdot, you rule.  Yes, there are a bunch of trolls here, and yes, this is about the most absurd situation anyone could have dreamt up...  but let me say this: I <strong>really</strong> don't want anyone to get fired over this.  Come on: it's a freaking mythical creature called the "Petite Lap Giraffe."  It's fun, it's stupid, etc... relax, everybody!</p>
<p><strong>Update</strong>: I want to point out that we don't know who edited the article.  All we know is the original author.  Therefore, please refrain from any kind of personal accusations targeted at the author; anyone at the LIP could have made the edits.</p>
<p><strong>Update</strong>: Since someone asked about my server logs, the answer is: yes, I checked them out.  On March 28 (the date their article was published) I did log one request for favicon.ico that originated at mail.longislandpress.com. Here it is:</p>
<pre><code>XXX.XXX.XXX.XX - - [28/Mar/2011:20:56:31 +0000] "GET /favicon.ico HTTP/1.0" 304 - "-" 
    "Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0"
</code></pre>
<p>It was served with an HTTP 304 code (meaning “unmodified”) which suggests the favicon was already in someone’s cache. That means the page had previously been loaded.  The timestamp is 20:56:31 UTC, meaning it was 4:56PM in New York. The timestamp on the original Long Island Press article is 5:02PM.</p>
<p><strong>To put it in a simpler way: someone from longislandpress.com visited my site less than 10 minutes before they published the article in question.  I have to admit I didn't expect the timestamps to be so close to each other, but... there they are!</strong></p>
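<p>If you want to check the arithmetic yourself, here is a minimal Python sketch.  It assumes New York was on daylight time (UTC-4) on that date, and it treats the article's 5:02PM timestamp as exactly 17:02:00, since the article only shows minutes.  The variable names are just for illustration.</p>
<pre><code>from datetime import datetime, timedelta, timezone

# Assumption: New York was on daylight time (UTC-4) on 28 March 2011.
NEW_YORK = timezone(timedelta(hours=-4))

# Timestamp copied from the favicon.ico log entry.
visit = datetime.strptime("28/Mar/2011:20:56:31 +0000", "%d/%b/%Y:%H:%M:%S %z")
visit_local = visit.astimezone(NEW_YORK)

# Publication time shown on the article (minute resolution only).
published = datetime(2011, 3, 28, 17, 2, tzinfo=NEW_YORK)

print(visit_local.strftime("%H:%M:%S"))  # 16:56:31
print(published - visit_local)           # 0:05:29
</code></pre>
<p>Since the seconds on the article's timestamp are unknown, the real gap could be up to a minute longer than what this prints.</p>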
<p><strong>Update</strong>: I kept going through the logs, and what do you know...  I noticed this entry, which originated from the same IP address as the previous entry:</p>
<pre><code>XXX.XXX.XXX.XX - - [29/Mar/2011:19:40:30 +0000] "GET /blog/2011/03/total-bummer-longislandpress-com-
    plagiarism-and-coverup/ HTTP/1.0" 200 13398 "http://www.longislandpress.com/[redacted wordpress
    admin.php]" "Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0"
</code></pre>
<p>Let me unpack this for you.  Someone:</p>
<ul>
<li>using the same IP address as the previous log entry</li>
<li>using the same browser as before (or at least providing the same UserAgent)</li>
<li>using the LIP wordpress admin interface (as indicated by the Referer field)</li>
<li>...clicked through to my site, in order to read this post</li>
</ul>
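<p>The same comparison can be done mechanically.  Here's a rough Python sketch that assumes the entries follow the Apache-style "combined" log format; the pattern and the field names are my own illustration, and the IP address stays redacted just like above.</p>
<pre><code>import re

# Assumption: Apache-style "combined" log format.
LOG_PATTERN = re.compile(
    r'(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"'
)
FIELDS = ("ip", "time", "request", "status", "size", "referer", "agent")

def parse(line):
    """Split one log line into a dict of named fields."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not look like a combined log entry")
    return dict(zip(FIELDS, match.groups()))

# The two entries quoted in this post, with the line wrapping removed.
first = parse(
    'XXX.XXX.XXX.XX - - [28/Mar/2011:20:56:31 +0000] "GET /favicon.ico HTTP/1.0" '
    '304 - "-" "Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0"'
)
second = parse(
    'XXX.XXX.XXX.XX - - [29/Mar/2011:19:40:30 +0000] '
    '"GET /blog/2011/03/total-bummer-longislandpress-com-plagiarism-and-coverup/ HTTP/1.0" '
    '200 13398 "http://www.longislandpress.com/[redacted wordpress admin.php]" '
    '"Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0"'
)

print(first["ip"] == second["ip"])        # True: same address
print(first["agent"] == second["agent"])  # True: same UserAgent string
print(second["referer"].startswith("http://www.longislandpress.com/"))  # True
</code></pre>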
<p><strong>So this suggests that someone who has access to the longislandpress.com wordpress admin interface also visited my site 6 minutes before publishing the article I contacted them about.</strong></p>
<p><strong>Update:</strong> Welp, it looks like they took their article offline.  I consider this to be pretty much a wrap, by now.  Thanks, everybody!</p>
<p><strong>Update</strong>: LongIslandPress.com put the <a rel="nofollow" href="http://www.longislandpress.com/2011/03/28/petite-lap-giraffes-real-or-directv-marketing-campaign/">article back online, with attribution</a>.  I'm happy enough.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[Petite Lap Giraffes: Real?]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2011/03/petite-lap-giraffes/" />
    <id>http://www.iandennismiller.com/blog/2011/03/petite-lap-giraffes/</id>
    <updated>2011-03-23T21:56:22Z</updated>
    <published>2011-03-23T21:56:22Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="funny" />
    <category scheme="http://www.iandennismiller.com/blog" term="observation" />
    <category scheme="http://www.iandennismiller.com/blog" term="found" />
    <summary type="html"><![CDATA[Petite Lap Giraffes: Real?]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2011/03/petite-lap-giraffes/"><![CDATA[<p>Okay - I have to admit I was totally amazed to see a real, live Petite Lap Giraffe walking around on a live video feed, straight from Russia. <a href="http://www.petitelapgiraffe.com">See here!</a> I was so amazed, in fact, that I had to figure out for sure if they were real or not.</p>
<p>For starters, I noticed <a href="http://www.petitelapgiraffe.com/">petitelapgiraffe.com</a> was registered a month ago by <a href="http://en.wikipedia.org/wiki/Grey_Global_Group">Grey Global Group</a>, a New York marketing firm.</p>
<pre><code>$ whois petitelapgiraffe.com
...
Administrative Contact, Technical Contact:
Grey Global Group
200 5th Ave
4th Fl
NEW YORK, NY 10010
US
212-546-1824 fax: 123 123 1234
Record expires on 15-Feb-2012.
Record created on 15-Feb-2011.
</code></pre>
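<p>To put a number on "a month ago," here is a quick sketch that measures the gap between the creation date in the whois record and the day this post was published.  Both dates come from this page; the variable names are just for illustration.</p>
<pre><code>from datetime import date, datetime

# "Record created on" value copied from the whois output above.
created = datetime.strptime("15-Feb-2011", "%d-%b-%Y").date()

# Publication date of this post.
posted = date(2011, 3, 23)

age_in_days = (posted - created).days
print(age_in_days)  # 36
</code></pre>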
<p>That wasn't quite enough to relegate the giraffe to myth-hood, though. If they were leaving their domain name swinging in the wind, they must have slipped up somewhere else.  To the batcave!  (by which I mean image metadata).  Take a look at <a href="http://www.petitelapgiraffe.com/photos.php">the pictures on the website</a>.  In particular, look at this one:</p>
<p><img alt="giraffe in hot tub" src="/assets/blog_images/image4.jpeg" /></p>
<p>Just look at it.  It's a cute, petite lap giraffe in a luxurious marble bath!</p>
<p>Oh, wait.  No, it's not.</p>
<p><img alt="corbis image rights" src="/assets/blog_images/Screen-shot-2011-03-24-at-2.11.26-PM.png" /></p>
<p>It's a stock Corbis image, catalog number 42-25705449.  See here (<a href="http://www.corbisimages.com/Enlargement/42-25705449.html">http://www.corbisimages.com/Enlargement/42-25705449.html</a>) for comparison:</p>
<p><img alt="corbis catalog number 42-25705449" src="/assets/blog_images/Open-bathroom-in-rustic-villa.jpeg" /></p>
<p>I know, I know...  I wanted a Petite Lap Giraffe too...  My best guess is that it's a DirecTV marketing campaign.  There is an <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Uncanny_valley">Uncanny Valley Giraffe</a> running on a treadmill in <a href="http://www.youtube.com/watch?v=-vHT6b7u1_Y">one of their videos</a>, and it's so adorable it just has to be a million polygons in a rendering farm somewhere.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[Using consistent key mappings across OS X applications]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2010/11/using-consistent-key-mappings-across-os-x-applications/" />
    <id>http://www.iandennismiller.com/blog/2010/11/using-consistent-key-mappings-across-os-x-applications/</id>
    <updated>2010-11-13T11:47:24Z</updated>
    <published>2010-11-13T11:47:24Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="Technology" />
    <summary type="html"><![CDATA[Using consistent key mappings across OS X applications]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2010/11/using-consistent-key-mappings-across-os-x-applications/"><![CDATA[<p>This morning, I was delighted to see an update for <a href="http://macromates.com/">TextMate</a> - but as I scanned the release notes, I was temporarily irked to learn the key combinations for tabbing between files had been changed.  "Tabs" are a major user interface element that makes web browsing and code editing more effective, but it's really annoying when different programs use different keys to do the same thing.  In this case, Google Chrome, Terminal.app, and TextMate have all used slightly different mappings at different times, and life would be so much easier if they were all the same.  Fortunately, OS X makes it very easy to control keystroke combinations, so I quickly changed TextMate to use my familiar tab forward/backward configuration - and I'll show you how to do this yourself.</p>
<h2>Keyboard Shortcuts to the rescue</h2>
<p>OS X provides an interface for mapping <strong>any</strong> key combination onto <strong>any</strong> menu item.  In terms of an application like TextMate, when you see a pull-down menu option like "Next File Tab", you can change the keys for that item to any combination you want.</p>
<p><img alt="Textmate, Next Tab" src="/assets/blog_images/Textmate-Next-Tab.png" /></p>
<p>Start by opening the Keyboard pane in System Preferences.</p>
<p><img alt="System Preferences, Keyboard" src="/assets/blog_images/System-Preferences-Keyboard.png" /></p>
<p>Then click the Keyboard Shortcuts tab at the top of the window and select Application Shortcuts in the left column.</p>
<p><img alt="Keyboard Shortcuts, Application Shortcuts" src="/assets/blog_images/Keyboard-Shortcuts-Application-Shortcuts.png" /></p>
<p>Next, click on the + icon, which will enable you to create a new mapping.</p>
<p><img alt="Application, Menu Title" src="/assets/blog_images/Application-Menu-Title.png" /></p>
<p>In the screenshot above, it says "tell TextMate to call the Next File Tab option whenever I press a certain keyboard combination."  You can add any application you want by pulling down Application, then finding "other" at the very end of the list:</p>
<p><img alt="Other Application" src="/assets/blog_images/Other-Application.png" /></p>
<p>And that does the trick!  TextMate behaves exactly like I want, even though they changed the key mappings with the latest release.  This principle generalizes to almost any OS X application out there, with certain notable exceptions like X11 programs.</p>
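<p>If you would rather script this than click through System Preferences, the same kind of mapping can be written with the <code>defaults</code> command.  The sketch below only builds and prints that command; it is a rough illustration, not something I have verified against every release.  The bundle identifier <code>com.macromates.textmate</code>, the helper name <code>shortcut_command</code>, and the <code>--apply</code> switch are all assumptions of mine, and you may need to restart the application before a change shows up.</p>
<pre><code>import shlex
import subprocess
import sys

# Assumptions (not from the steps above): the bundle identifier, and the
# modifier encoding used by the NSUserKeyEquivalents preference
# (@ = cmd, ~ = alt/option, followed by the key itself).
BUNDLE_ID = "com.macromates.textmate"
RIGHT_ARROW = r"\U2192"  # escape for the right-arrow key, kept literal by the raw string


def shortcut_command(bundle_id, menu_title, keys):
    """Build the `defaults` invocation that maps a menu item to a key combination."""
    return ["defaults", "write", bundle_id, "NSUserKeyEquivalents",
            "-dict-add", menu_title, keys]


# cmd-alt-right-arrow for the "Next File Tab" menu item.
command = shortcut_command(BUNDLE_ID, "Next File Tab", "@~" + RIGHT_ARROW)
print(shlex.join(command))

# Only touch real preferences on a Mac, and only when explicitly asked to.
if sys.platform == "darwin" and "--apply" in sys.argv:
    subprocess.run(command, check=True)
</code></pre>
<p>The menu title has to match the menu item's text exactly, just as it does in the dialog shown above.</p>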
<h2>What makes a good key combination?</h2>
<p>This is a rather philosophical question, but one way to approach it is this: the most common commands should be the easiest to type.  This property can be measured in terms of whether you can do the keystroke one-handed (e.g. cmd-s) versus two-handed (cmd-^), or in terms of how far you have to stretch your fingers to reach the keys (cmd-z versus cmd-y).  I am using cmd-alt-arrow keys to control my tabs, which is a two-handed combination.  I might even consider switching to something simpler, but I'm used to this combination by now, and I use the same combination in several different applications...  so it's probably here to stay.</p>
<p>Now that you know how to control any application, you can normalize between applications too.  This process is so simple, and it can relieve so many little headaches.  Enjoy!</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[handy utility: watchpaths]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2010/07/handy-utility-watchpaths/" />
    <id>http://www.iandennismiller.com/blog/2010/07/handy-utility-watchpaths/</id>
    <updated>2010-07-24T22:44:42Z</updated>
    <published>2010-07-24T22:44:42Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="Technology" />
    <summary type="html"><![CDATA[handy utility: watchpaths]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2010/07/handy-utility-watchpaths/"><![CDATA[<p>Imagine you are working on a set of files on your computer, and each time you change one of those files, you want to run a program to process the files again. This comes up all over the place, whether it's software development, statistics, image processing, or lots of other domains. Recently, I was editing some source code, and each time I changed a file, I wanted to run a series of tests to make sure everything still worked. I made this process automatic with the help of a really handy utility called watchpaths.</p>
<h2>Installing watchpaths</h2>
<p>First, download watchpaths and place it somewhere in your path. I use ~/bin, so try something like this:</p>
<pre><code>cd ~/bin
wget https://github.com/iandennismiller/watchpaths/raw/master/bin/watchpaths.py
chmod 755 ~/bin/watchpaths.py
</code></pre>
<h2>Using watchpaths</h2>
<p>Let's say I want to monitor a folder containing images, and each time a new image is added I want to sync the folder to a remote computer. Using watchpaths, that will look like:</p>
<pre><code>watchpaths.py "rsync -a ~/my_pictures user@example.com:public_html" ~/my_pictures
</code></pre>
<p>To convert that command into English, it would sound like this:</p>
<blockquote>
<p>"Watch the my_pictures folder for any changes (new files, deleted files, updated files, etc.) and each time a change happens in that folder, synchronize the contents of that folder with my web server."</p>
</blockquote>
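<p>To make the idea concrete, here is a minimal sketch of how a watcher like this could work.  It is not the actual watchpaths source: it simply polls file modification times and runs a command when anything differs.  The function names <code>snapshot</code> and <code>watch</code>, and the <code>interval</code> and <code>max_polls</code> parameters, are made up for illustration.</p>
<pre><code>import os
import subprocess
import time


def snapshot(root):
    """Map every file under root to its modification time."""
    mtimes = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                mtimes[path] = os.stat(path).st_mtime
            except OSError:
                pass  # the file vanished between listing and stat
    return mtimes


def watch(root, command, interval=1.0, max_polls=None):
    """Run command (a shell string) each time the snapshot of root changes."""
    previous = snapshot(root)
    polls = 0
    # max_polls=None means "keep watching until interrupted".
    while max_polls is None or polls != max_polls:
        time.sleep(interval)
        current = snapshot(root)
        if current != previous:
            # Added, deleted, and modified files all change the snapshot.
            subprocess.run(command, shell=True)
            previous = current
        polls += 1
</code></pre>
<p>With that in place, <code>watch(os.path.expanduser("~/my_pictures"), "rsync -a ~/my_pictures user@example.com:public_html")</code> would behave roughly like the command above.  Polling is the simplest approach to sketch, though it reacts more slowly than a watcher built on the operating system's file events.</p>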
<h2>More Information</h2>
<p>The project page, including links for downloads, is <a href="/projects/watch_paths.html">here</a>.  If there is any interest, I am happy to incorporate feedback.</p>]]></content>
  </entry>
  <entry>
    <author>
      <name></name>
      <uri>http://www.iandennismiller.com/blog</uri>
    </author>
    <title type="html"><![CDATA[Passwords, and the Apple Keychain]]></title>
    <link rel="alternate" type="text/html" href="http://www.iandennismiller.com/blog/2010/05/passwords-and-the-apple-keychain/" />
    <id>http://www.iandennismiller.com/blog/2010/05/passwords-and-the-apple-keychain/</id>
    <updated>2010-05-19T11:47:18Z</updated>
    <published>2010-05-19T11:47:18Z</published>
    <category scheme="http://www.iandennismiller.com/blog" term="Technology" />
    <summary type="html"><![CDATA[Passwords, and the Apple Keychain]]></summary>
    <content type="html" xml:base="http://www.iandennismiller.com/blog/2010/05/passwords-and-the-apple-keychain/"><![CDATA[<p>Some time around 2006, I started thinking about my online passwords in a new way. Until this point, I had used a collection of perhaps a dozen gibberish passwords, which I reused on various sites depending on the sensitivity of the site. For example, my bank account would use a nearly unique password, whereas a random forum would use a very commonly reused password.</p>
<p>This worked acceptably well, but I frequently had to ask myself: "which password did I use when I signed up for this service?" In response to having to guess my own passwords, I made two decisions: I would start writing my passwords down, and I would make them all unique and randomly generated.  Four years later, I am using a totally different system, and I'll explain all of my reasoning.</p>
<p>To facilitate my random password approach, I started using 3x5 index cards and a card filer. I added A-Z tabs, and I generally filed cards according to the domain name of the service (e.g. paypal.com is filed under P). I wrote a quick Perl script to make 10 random passwords at a time, and I would pick one from the list and write it down on the index card. I really liked the concept of a purely non-digital password storage system, because it would be essentially unhackable without physical access. <em>Essentially unhackable</em> - more on this later.</p>
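<p>That script isn't reproduced here, but a rough Python equivalent might look like the sketch below.  The password length of 16, the names <code>make_passwords</code> and <code>LOOKALIKES</code>, and the decision to leave out look-alike characters are my own choices for this example rather than a description of the original script.</p>
<pre><code>import secrets
import string

# Leave out characters that are easy to confuse when written by hand.
LOOKALIKES = "Il1O0"
ALPHABET = "".join(c for c in string.ascii_letters + string.digits
                   if c not in LOOKALIKES)


def make_passwords(count=10, length=16):
    """Return `count` random passwords drawn from ALPHABET."""
    return ["".join(secrets.choice(ALPHABET) for _ in range(length))
            for _ in range(count)]


# Print a batch of candidates and pick one to write on the card.
for password in make_passwords():
    print(password)
</code></pre>
<p>The <code>secrets</code> module draws from the operating system's cryptographic random source, which is what you want for passwords.</p>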
<p>There were several drawbacks to the index card system. For brevity, I'll just list them:</p>
<ul>
<li>
<p>writing some characters by hand is ambiguous. I confused capital I, lowercase L, and numeral 1 all the time. Capital O and numeral 0 are also tricky.</p>
</li>
<li>
<p>it's possible to copy the password incorrectly</p>
</li>
<li>
<p>it is extremely difficult to create a backup copy, so catastrophic loss is a possibility</p>
</li>
<li>
<p>if someone has physical access to the index cards, they have access to your accounts</p>
</li>
<li>
<p>it's tedious to type in a random password every time you log in</p>
</li>
<li>
<p>it doesn't scale well after about 400 accounts</p>
</li>
</ul>
<p>The scaling problems were the real killer. For example, did I file sandbox.paypal.com under P for paypal or S for sandbox? I couldn't remember, so I had to perform a linear search through both letters.  And since a disproportionate number of words start with S, flipping through all the S cards to find one site became tedious, whereas a site that started with Y was pretty quick to look up because there were fewer of them. Eventually, looking up cards became such a chore that I was too lazy to log in to my accounts! Total failure.</p>
<p><img alt="keychain icon" src="/assets/blog_images/Keychain-Icon.png" /></p>
<p>The solution for me is to use <a href="http://en.wikipedia.org/wiki/Apple_Keychain">Apple Keychain</a>. If you're a <a href="http://en.wikipedia.org/wiki/Getting_things_done">GTD adherent</a>, then you'll understand what I mean when I say this is my trusted system for account information. How did I reconcile digital password storage with my original goal of keeping my passwords offline to make them unhackable? I realized that both offline passwords and the keychain can be successfully attacked with a keystroke logger. If someone went to those lengths to get a password, then it wouldn't matter how it was originally stored; the password could be intercepted regardless.</p>
<p>Why use Apple Keychain? Based on my list of drawbacks for the index cards, here's a list of pro-Keychain points:</p>
<ul>
<li>
<p>built-in random password generator</p>
</li>
<li>
<p>keyword search</p>
</li>
<li>
<p>simple cut-and-paste workflow makes it very easy to enter passwords without typing</p>
</li>
<li>
<p>keychain itself is password protected</p>
</li>
<li>
<p>passwords are <a href="http://en.wikipedia.org/wiki/Triple_DES">Triple DES</a> encrypted (which should be acceptable until the year 2030)</p>
</li>
<li>
<p>simple to back up keychain file</p>
</li>
<li>
<p>slick integration with many applications, including Mail.app, subversion, and Safari/Chrome.</p>
</li>
</ul>
<p>I'm currently at about 900 accounts (yes - this is deserving of a separate post unto itself) and the system is working great. I think this scales to meet my requirements, and probably beyond. In practical terms, a password that used to take 30 seconds to retrieve is now instant.  I probably save 5 minutes per day by having switched away from index cards, and I am avoiding untold frustrations.  In all, I recommend Apple Keychain highly.</p>]]></content>
  </entry>
</feed>

