tag:quandyfactory.com,2010-7-27:/2010727
2010-7-27T12:00:00Z
Quandy Factory Newsfeed - Projects
Quandy Factory is the personal website of Ryan McGreal in Hamilton, Ontario, Canada.
http://quandyfactory.com/projects/56/makechart
2010-04-27T12:00:00Z
MakeChart
<p>Check out the <a href="http://github.com/quandyfactory/MakeChart">Github repository</a> for source and documentation.</p>
<p>Here's an example of its use, assuming you save makechart.py somewhere on your Python path:</p>
<pre><code># Python 2 example; json is in the standard library from 2.6 onward
try:
    import json
except ImportError:
    import simplejson as json
import urllib
import makechart

# chart with world petroleum production data by month from EIA
url = 'http://quandyfactory.com/json/makechart'
response = urllib.urlopen(url)
contents = response.read()
dataset = json.loads(contents)

caption = 'World Oil Production by Month, 2001-2010<br>(Source: EIA)'
unit = 'mbpd'
chart = makechart.make_chart(dataset, caption, unit)
html = makechart.make_html(chart)

# write the generated page to disk
outfile = open('makechart_example.html', 'w')
outfile.write(html)
outfile.close()
</code></pre>
<p>Note: for the sake of convenience, this example uses a sample dataset in JSON format that is <a href="/json/makechart">hosted on this website</a>.</p>
Ryan McGreal
2
http://quandyfactory.com/projects/40/pytoc
2010-04-15T12:00:00Z
PyToc
<h3>Introduction</h3>
<p>PyToc generates a table of contents for an HTML document based on headings, with anchor links from the TOC to specific headings.</p>
<h3>Download</h3>
<p>You can download the latest version of PyToc from its github repository:</p>
<ul>
<li><a href="http://github.com/quandyfactory/PyToc">http://github.com/quandyfactory/PyToc</a></li>
</ul>
<h3>Requirements</h3>
<ul>
<li>Python 2.5 or 2.6</li>
<li><a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a> 3.0.8</li>
</ul>
<h3>Using PyToc</h3>
<p>You can see the code in action on this very page!</p>
<p>It's pretty simple to use. <a href="http://github.com/quandyfactory/PyToc">Download</a> <code>pytoc.py</code> and save it somewhere on your Python path.</p>
<p>Here's a demonstration:</p>
<pre><code>>>> import urllib
>>> import pytoc
>>> url = 'http://quandyfactory.com/projects/40/pytoc'
>>> page = urllib.urlopen(url)
>>> html = page.read()
>>> toc = pytoc.Toc(html_in=html)
>>> toc.make_toc()
True
>>> toc.html_toc # returns an HTML table of contents
>>> toc.html_out # returns the html with anchors and numbering in headings
>>> toc.toc_list # returns a list of tuples in the form (section number, title)
</code></pre>
<h4>Input Properties</h4>
<p>The following are input properties you enter to generate the table of contents.</p>
<ul>
<li><p><code>html_in</code> - The HTML document for which you want to generate a table of contents.</p>
<p>This is the only necessary property to assign. The rest have default values that may meet your needs.</p></li>
<li><p><code>levels</code> - A list of numbers corresponding to the heading levels you want to include in your TOC.</p>
<p>E.g. [3, 4] would include <code><h3></code> and <code><h4></code> headings. Default is [3, 4].</p></li>
<li><p><code>id</code> - The base id of the HTML table of contents to be generated. Default is "toc".</p></li>
<li><p><code>title</code> - The title of the generated table of contents. Default is "Contents".</p></li>
</ul>
<h4>Methods</h4>
<ul>
<li><code>make_toc()</code> - this generates the table of contents and populates the output properties. Returns True when complete.</li>
</ul>
<h4>Output Properties</h4>
<p>After calling the <code>make_toc()</code> method, the following output properties are populated with values.</p>
<ul>
<li><p><code>html_out</code> - The same as <code>html_in</code> except with the TOC anchors and numbering included in the headings.</p></li>
<li><p><code>html_toc</code> - The generated HTML table of contents.</p></li>
<li><p><code>toc_list</code> - A list of tuples containing the anchors and headings, in case you would rather roll your own HTML table of contents.</p></li>
</ul>
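<p>If you would rather roll your own table of contents, a minimal sketch might look like the following. It assumes <code>toc_list</code> holds (section number, title) tuples, as in the demonstration above. The sample data, the <code>render_toc</code> helper and the <code>toc-</code> anchor naming scheme are hypothetical and are not part of PyToc, and the sketch is written for Python 3.</p>

```python
from html import escape

# Hypothetical sample data in the (section number, title) form.
toc_list = [
    ("1", "Introduction"),
    ("2", "Using PyToc"),
    ("2.1", "Input Properties"),
    ("2.2", "Methods & Output"),
]

def render_toc(entries, toc_id="toc", title="Contents"):
    """Build a simple HTML list with one link per TOC entry."""
    lines = ['<div id="%s">' % escape(toc_id),
             "<h3>%s</h3>" % escape(title),
             "<ul>"]
    for number, heading in entries:
        # Assumed anchor scheme: "<toc_id>-<section number>", e.g. "toc-2.1".
        anchor = "%s-%s" % (toc_id, number)
        lines.append('<li><a href="#%s">%s %s</a></li>'
                     % (escape(anchor), escape(number), escape(heading)))
    lines.extend(["</ul>", "</div>"])
    return "\n".join(lines)

print(render_toc(toc_list))
```

<p>The anchors only resolve if the headings in your document carry matching <code>id</code> values, so adjust the naming scheme to whatever <code>html_out</code> actually contains.</p>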
<p>That's it, really.</p>
<h3>History</h3>
<p>This library started out as one-off code to generate a table of contents for a long document that I'm converting from MS Word format over to HTML. </p>
<p>The existing Word document has had many contributors and editors over the years and the format is a shambles. The table of contents is a mess and the headings are all over the place. (Thanks, WYSIWYG.) </p>
<p>I converted the whole thing into <a href="http://daringfireball.net/projects/markdown/" title="John Gruber's Markdown">Markdown</a>, a simple plain-text formatting syntax that converts to clean, structural HTML. </p>
<p>I still wanted the final document to have a table of contents, but I didn't want to have to go through the bother of maintaining the thing - especially if I ever wanted to add a new section in the middle, which would require a re-numbering of all the subsequent sections and subsections.</p>
<p>I whipped up a simple parser that walked the document and dynamically generated a table of contents, plus anchors in the document so the section headings listed in the contents could link straight down to the sections themselves.</p>
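<p>The renumbering problem can be sketched with one counter per heading level. This is only an illustration of the idea in Python 3, not PyToc's actual code; the <code>number_headings</code> function and its (level, title) input format are assumptions made for the example.</p>

```python
def number_headings(headings, levels=(3, 4)):
    """Assign dotted section numbers to (level, title) pairs.

    `levels` lists the heading levels to include, outermost first.
    """
    counters = [0] * len(levels)
    numbered = []
    for level, title in headings:
        if level not in levels:
            continue  # skip headings outside the requested levels
        depth = levels.index(level)
        counters[depth] += 1
        # Starting a new section resets the counters of all deeper levels.
        for deeper in range(depth + 1, len(counters)):
            counters[deeper] = 0
        number = ".".join(str(n) for n in counters[:depth + 1])
        numbered.append((number, title))
    return numbered

headings = [(3, "Introduction"), (3, "Using PyToc"), (4, "Input Properties"),
            (4, "Methods"), (3, "History")]
for number, title in number_headings(headings):
    print(number, title)
```

<p>Because the numbers are computed on every run, adding a section in the middle of the document renumbers everything after it for free.</p>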
<h3>Need for Flexibility</h3>
<p>It worked, but the code was brittle. It required the HTML formatting to be very strict, e.g. the following would work:</p>
<pre><code><h3>Some subheading title</h3>
</code></pre>
<p>but the following would not work:</p>
<pre><code><h3>
Some subheading title
</h3>
</code></pre>
<p>Documents generated using <a href="http://code.google.com/p/python-markdown2/">python-markdown2</a> would work pretty consistently, but documents generated using, say, <a href="http://www.freewisdom.org/projects/python-markdown/">python-markdown</a> would not, since the latter produces messier HTML output. </p>
<p>Of course, with documents produced using other means, all bets were off.</p>
<p>Anyway, I decided that if this was going to be at all useful as a general-purpose tool, it needed to be more flexible and forgiving. Of course, if you're programming in python, parsing HTML and want to be flexible and forgiving, there's no better tool than the mighty <a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a> parsing library.</p>
<p>So I re-wrote it using BeautifulSoup.</p>
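<p>To illustrate the difference, here is a rough Python 3 sketch. The strict pattern is a guess at the kind of matching that breaks on multi-line headings, not the original code, and the forgiving version uses the standard library's <code>html.parser</code> as a stand-in for BeautifulSoup so that the example has no third-party dependencies. The names <code>STRICT_H3</code>, <code>HeadingCollector</code> and <code>forgiving_headings</code> are hypothetical.</p>

```python
import re
from html.parser import HTMLParser

# Strict: "." does not match newlines, so the heading must sit on one line.
STRICT_H3 = re.compile(r"<h3>(.*)</h3>")

class HeadingCollector(HTMLParser):
    """Collect the text of <h3> headings, however they are laid out."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.parts = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_heading = True
            self.parts = []

    def handle_data(self, data):
        if self.in_heading:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self.in_heading:
            self.headings.append("".join(self.parts).strip())
            self.in_heading = False

def forgiving_headings(html):
    """Return the stripped text of every <h3> in html."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings

one_line = "<h3>Some subheading title</h3>"
multi_line = "<h3>\nSome subheading title\n</h3>"

print(STRICT_H3.findall(one_line))     # finds the heading
print(STRICT_H3.findall(multi_line))   # finds nothing
print(forgiving_headings(multi_line))  # finds the heading
```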
Ryan McGreal
2
http://quandyfactory.com/projects/51/pycouchcontacts
2010-04-12T12:00:00Z
PyCouchContacts
<p>Coming soon.</p>
Ryan McGreal
2
http://quandyfactory.com/projects/49/gitiot
2010-04-07T12:00:00Z
Gitiot
<p>Gitiot is a really simple cross-platform GUI wrapper for the most minimal useful subset of git's awesome power; i.e. one-button commit and push-to-master for people who want revision control but don't want to learn the command line.</p>
<p>Download the repository on <a href="http://github.com/quandyfactory/Gitiot">github</a>.</p>
Ryan McGreal
2
http://quandyfactory.com/projects/48/download_tweets
2010-04-01T12:00:00Z
Download Tweets
<h3>Introduction</h3>
<p>A simple python script to download all the tweets for a given Twitter username.</p>
<p>Download the script and documentation from its <a href="http://github.com/quandyfactory/download_tweets">github repository</a>.</p>
<h3>Notes</h3>
<ul>
<li><p>The Twitter API will only let you download <a href="http://apiwiki.twitter.com/Things-Every-Developer-Should-Know">the most recent 3,200 tweets</a>. (Don't worry - all your tweets are still in their database. They eventually plan to make them all available.)</p></li>
<li><p>The Twitter API also <a href="http://apiwiki.twitter.com/Rate-limiting">limits the number of data requests</a> to 150 per hour. At 20 tweets per page, that means you're actually limited to 3,000 tweets per hour.</p></li>
</ul>
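<p>The arithmetic behind those two limits can be checked with a few lines of Python 3. The figures come from the notes above; the sketch assumes one request per page of results and that nothing else counts against the hourly limit.</p>

```python
import math

TWEETS_PER_PAGE = 20       # page size quoted above
REQUESTS_PER_HOUR = 150    # rate limit quoted above
MAX_TWEETS = 3200          # most recent tweets the API exposes

def pages_needed(tweet_count, per_page=TWEETS_PER_PAGE):
    """Number of page requests needed to fetch tweet_count tweets."""
    return math.ceil(tweet_count / per_page)

tweets_per_hour = REQUESTS_PER_HOUR * TWEETS_PER_PAGE
requests_for_all = pages_needed(MAX_TWEETS)

print(tweets_per_hour)                       # 3000
print(requests_for_all)                      # 160
print(requests_for_all > REQUESTS_PER_HOUR)  # True
```

<p>In other words, fetching all 3,200 available tweets takes 160 requests, ten more than a single hour allows.</p>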
<h3>Exciting Update</h3>
<p>My first fork! On a request from <a href="http://twitter.com/adr/status/11427399262">John Fink</a>, <a href="http://twitter.com/parlar">Jay Parlar</a> was nice enough to <a href="http://github.com/parlarjb/download_tweets">fork Download Tweets</a> and add optional command line username and filename arguments.</p>
<p>As soon as I get a chance, I'll merge his additions into my branch.</p>
Ryan McGreal
2
http://quandyfactory.com/projects/44/hamilton_spectator_pdf_downloader
2010-03-23T12:00:00Z
Hamilton Spectator PDF Downloader
<style type="text/css">
div.formdiv { text-align: center; clear: both; width: 12em; }
.formdiv span { float: left; display: block; width: 5em; text-align: left; }
span.label { text-align: right; margin-right: 5px; }
</style>
<form>
<div class="formdiv">
<span class="label">Year: </span>
<span> <input id="spec_year" name="spec_year"></span>
</div>
<div class="formdiv">
<span class="label">Month: </span>
<span><input id="spec_month" name="spec_month"></span>
</div>
<div class="formdiv">
<span class="label">Day: </span>
<span> <input id="spec_day" name="spec_day"></span>
</div>
<div class="formdiv">
<span class="label">Page: </span>
<span> <input id="spec_page" name="spec_page" value="A1"></span>
</div>
<div class="formdiv">
<input type="button" name="spec_submit" value="Get PDF Page" onclick="spec_get_pdf(); return false;">
</div>
</form>
<script type="text/javascript">
// Pre-fill the form with today's date, zero-padded to two digits.
var d = new Date();
var year = d.getFullYear() + '';
var month = d.getMonth() + 1 + '';
if (month.length == 1) {
    month = '0' + month;
}
var day = d.getDate() + '';
if (day.length == 1) {
    day = '0' + day;
}
document.getElementById('spec_year').value = year;
document.getElementById('spec_month').value = month;
document.getElementById('spec_day').value = day;

// Build the PDF URL from the form values and navigate to it.
function spec_get_pdf() {
    var year = document.getElementById('spec_year').value;
    var month = document.getElementById('spec_month').value;
    var day = document.getElementById('spec_day').value;
    year = set_length(year, 4);
    month = set_length(month, 2);
    day = set_length(day, 2);
    var page = document.getElementById('spec_page').value;
    page = page.toUpperCase();
    var specstring = 'http://www.hamiltonspectator.com/pdfs/';
    var link = specstring + year + month + day + '/' + page + '.pdf';
    location.href = link;
}

// Left-pad val with zeroes until it is len characters long.
function set_length(val, len) {
    var zeroes = '';
    var diff = len - val.length;
    if (diff > 0) {
        for (var i = 0; i < diff; i++) {
            zeroes += '0';
        }
    }
    return zeroes + val;
}
</script>
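<p>For anyone who would rather script this than use the form, the same URL construction can be sketched in Python 3. The URL pattern is taken from the JavaScript above; the function name <code>spec_pdf_url</code> is hypothetical, and the sketch does not check whether a PDF actually exists at the resulting address.</p>

```python
import datetime

BASE_URL = "http://www.hamiltonspectator.com/pdfs/"

def spec_pdf_url(date, page="A1"):
    """Return the PDF URL for a given date and page, e.g. page 'A1'."""
    # strftime zero-pads the month and day, like set_length() in the form.
    return "%s%s/%s.pdf" % (BASE_URL, date.strftime("%Y%m%d"), page.upper())

print(spec_pdf_url(datetime.date(2010, 3, 5), "a1"))
```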
Ryan McGreal
2
http://quandyfactory.com/projects/32/stack_trace_for_hnshpy_on_proxy
2010-01-07T12:00:00Z
Stack Trace for hnsh.py on proxy
<h3>The Stack Trace</h3>
<p>Here's the original stack trace (Python 2.6 on a Windows XP machine behind a proxy server) for <a href="http://scottjackson.org/software/hnsh/">Hacker News Shell</a>.</p>
<pre><code>C:\Python26>python hnsh.py
Traceback (most recent call last):
File "hnsh.py", line 558, in <module>
hnsh = HackerNewsShell()
File "hnsh.py", line 314, in __init__
self.stories = self.h.getLatestStories(self.alreadyReadList)
File "hnsh.py", line 194, in getLatestStories
source = self.getSource("http://news.ycombinator.com")
File "hnsh.py", line 35, in getSource
f = urllib2.urlopen(url)
File "C:\Python26\lib\urllib2.py", line 124, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python26\lib\urllib2.py", line 389, in open
response = self._open(req, data)
File "C:\Python26\lib\urllib2.py", line 407, in _open
'_open', req)
File "C:\Python26\lib\urllib2.py", line 367, in _call_chain
result = func(*args)
File "C:\Python26\lib\urllib2.py", line 1140, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python26\lib\urllib2.py", line 1115, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 11001] getaddrinfo failed>
</code></pre>
<h3>Update 1: Proxies With Urllib2</h3>
<p>After prompting the user for their proxy server, you'll probably want to do something like this:</p>
<pre><code>import urllib2

# Example: proxies is a dict
proxies = { 'http': 'http://proxy.domain.com:80' }
proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

# url is the address to fetch, e.g. the one from the stack trace
url = 'http://news.ycombinator.com'
urllib2.urlopen(url)
</code></pre>
<p>This is untested, but it should work more or less as written.</p>
<h3>Update 2: If No Proxy</h3>
<p>It may be helpful to note further that you can submit an empty <code>proxies</code> dict in the above code if the internet connection is direct:</p>
<pre><code>proxies = {}
</code></pre>
<p>This way, you can use the same code to connect to the URL whether or not the user is behind a proxy server.</p>
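<p>As a rough sketch of that single code path, here is the same idea in Python 3, where <code>urllib2</code> became <code>urllib.request</code>. The helper names <code>proxy_settings</code> and <code>make_opener</code> are hypothetical and the proxy address is a placeholder. The sketch only builds the openers and does not make a network request.</p>

```python
import urllib.request

def proxy_settings(proxy_url=None):
    """Return a proxies dict: empty for a direct connection."""
    return {"http": proxy_url} if proxy_url else {}

def make_opener(proxy_url=None):
    """Build an opener that uses proxy_url, or connects directly if None."""
    # An explicit (even empty) dict stops urllib from auto-detecting
    # proxy settings from the environment.
    handler = urllib.request.ProxyHandler(proxy_settings(proxy_url))
    return urllib.request.build_opener(handler)

# Placeholder proxy address; replace with whatever the user supplies.
opener = make_opener("http://proxy.domain.com:80")
direct = make_opener()
# opener.open(url) or direct.open(url) would then fetch url via that route.
```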
<h3>Update 3: hnsh.py on GitHub</h3>
<p>Scott Jackson, author of hnsh.py, moved the code into git and posted the repository on GitHub:</p>
<ul>
<li><a href="http://github.com/scottjacksonx/hnsh">http://github.com/scottjacksonx/hnsh</a></li>
</ul>
<p>Check this out from the <a href="http://github.com/scottjacksonx/hnsh/blob/master/hnsh.py">source code</a>:</p>
<blockquote>
<p>Special thanks to Ryan McGreal (http://github.com/quandyfactory) for the code that makes hnsh work from behind a proxy.</p>
</blockquote>
<p><em>flushes with pride</em></p>
Ryan McGreal
2
http://quandyfactory.com/projects/38/pycalendar
2010-01-05T12:00:00Z
PyCalendar
<p>This class has since been rolled into <a href="/projects/5/quandy">Quandy</a>.</p>
Ryan McGreal
2
http://quandyfactory.com/projects/37/wayne_macphail's_url_lengthener
2009-12-22T12:00:00Z
Wayne MacPhail's URL Lengthener
<p>Wayne MacPhail sent out <a href="http://twitter.com/wmacphail/status/6929150854">a</a> <a href="http://twitter.com/wmacphail/status/6935590394">few</a> <a href="http://twitter.com/wmacphail/status/6936928561">tweets</a> today in which he proposed an "URL lengthener". In a fit of whimsy (leavened by the promise of <a href="/blog/1/productivity_and_procrastination">procrastination</a> from present-wrapping), I decided to build it.</p>
<p>You can find it here:</p>
<ul>
<li><a href="http://quandyfactory.com/longurl">http://quandyfactory.com/longurl</a></li>
</ul>
Ryan McGreal
2
http://quandyfactory.com/projects/28/rth_codebase_redesign
2009-12-20T12:00:00Z
RTH Codebase Redesign
<p>The code is now running live on <a href="http://raisethehammer.org">raisethehammer.org</a>. The code is published to this project's <a href="http://github.com/quandyfactory/rth_codebase">GitHub repository</a>.</p>
<p>If you encounter any bugs, please report them to the <a href="http://github.com/quandyfactory/rth_codebase/issues">bug tracker</a>.</p>
Ryan McGreal
2