2009-01-16

Baby Clojure step number two: my very first mind expansion

So, today, after just a little more fiddling to get Clojure/swank/SLIME working on my desktop box, I actually ran into a more Clojurish obstacle. But first I had better explain the little project I mentioned yesterday.

I used to read RSS feeds with Gnus, but ever since I switched to using the Gmail IMAP interface with Gnus (rather than fetching mail to my local server and reading it from there), the extra time spent reloading all my feeds every time I wanted to check my mail started to discourage me. And that was the end of nnrss for me. At least for a while.

The other day I started to think that nnrss might work for me again if I kept local copies of the feeds and used wget or something to keep them up to date. As I started putting a bash script together, I realized that my specs were quickly outgrowing my bash skills: keeping track of the time of the last fetch (because wget won't do its timestamping if you rename the file you download, which you have to do because feeds often share the same filename...), conditionally running Atom feeds through an XSL stylesheet to make them readable for nnrss, and probably more as the list got longer. I did look around for some kind of ready-made solution to all of this, but everything I found was somehow too complete. So just as I was about to start writing some Perl, I thought: "you're in no hurry, do this in Clojure and learn a thing or two". So here I am.

Anyway, step one right now is to read a text file with a list of URLs, each paired with a feed name. Next I will add a "type" field, or maybe an "atomp" field, to pick out the feeds that need further parsing. So we end up looking at ideal material for a Perl one-liner but a metaphysical challenge for the Clojure 00ber-n00b that I am. Yesterday I mentioned my first flailings, which got me as far as reading a file, splitting the lines and printing them back out. Heady stuff, I know.

Today's challenge was stuffing everything back into what was to become my feed "database". Using Stuart Halloway's rewrite of Peter Seibel's PCL (specifically the now-famous CD database), I had settled on a set of structs to hold all my information.

Everything seemed very calm and smooth for a while. I could write functions for making structs and adding them to the set (#{}). I could read my data file; I was using first and second and felt like I was back in Lisp.
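For the record, the setup looked roughly like this (the field names match the code further down, but `feed-db` and `add-feeds` are names I am reconstructing from memory, so take the exact shape with a grain of salt):

```clojure
;; Sketch of the feed "database": a struct definition plus an empty set.
;; feed-db and add-feeds are reconstructed names, not gospel.
(defstruct feed :name :title :url)

(def feed-db #{})                 ; the database starts as an empty set

(defn add-feeds
  "Returns a NEW set containing db's feeds plus the given ones."
  [db & feeds]
  (into db feeds))
```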

And then suddenly I hit the wall of immutability.

Here is what I wanted to do:

  
(defn parse-rsslist-file [db feedlist-filename]
  (with-open [r (reader feedlist-filename)]
    (doseq [line (line-seq r)]
      (let [url (first (.split line " "))
            nam (second (.split line " "))]
        (add-feeds db (struct feed nam nil url nil)))))
  db)
  

I was stupidly trying to loop through the lines and accumulate the results in db. That is what I would have done in Common Lisp and just about anything else. When my function kept returning an empty #{}, I finally realized that I was face to face with real functional programming and one of the most commented-on aspects of Clojure.
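Here is the wall in miniature: conj never modifies a set, it hands back a new one, so unless you thread the result along, the original stays empty.

```clojure
(def db #{})
(def db2 (conj db "a feed"))   ; conj builds and returns a NEW set

db    ;; still #{} — nothing "happened" to it
db2   ;; #{"a feed"}
```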

This led to a new exploration of the Clojure docs and a tentative understanding of the differences between a sequence, a collection, and a list. I knew that I needed to write a recursive function, but I was not sure exactly what (line-seq r) was spitting out. Happily for me, it turns out that everything does end up working more or less like a list, and so with very little fiddling I was able to get my magnificent file-reading function to work.
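The detail that made it click for me: seq turns an empty collection into nil, and nil is false, so it doubles as the recursion's stopping test.

```clojure
(seq [])             ;; => nil — empty collection, so the recursion can stop
(seq ["line one"])   ;; => ("line one")
(first ["a" "b"])    ;; => "a" — sequence functions work on collections too
```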

Behold!

    
;; reader comes from clojure.contrib.duck-streams
;; (clojure.java.io/reader in later Clojure)
(declare parssrss db-add-line)  ; defined below, so declare them first

(defn parse-rsslist-file [db urlfile]
  (with-open [r (reader urlfile)]
    (parssrss db (line-seq r))))

(defn parssrss [db sq]
  (if-not (seq sq)  ; seq returns nil on an empty sequence
    db
    (parssrss (db-add-line db (first sq)) (rest sq))))

(defn db-add-line [db line]
  (let [lspl (.split line " ")]
    (cons (struct-map feed
                      :name (second lspl)
                      :title nil
                      :url (first lspl))
          db)))
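
As a footnote: threading an accumulator through a sequence like this is exactly what reduce does, so the whole recursion collapses into a single call. A self-contained sketch (re-declaring the struct and helper so it runs on its own; the example URLs are made up):

```clojure
(defstruct feed :name :title :url)

(defn db-add-line [db line]
  (let [lspl (.split line " ")]
    (cons (struct-map feed
                      :name (second lspl)
                      :title nil
                      :url (first lspl))
          db)))

;; reduce threads the growing db through db-add-line, one line at a time;
;; cons means the result comes out in reverse order of the input lines
(reduce db-add-line '() ["http://example.com/a.xml afeed"
                         "http://example.com/b.xml bfeed"])
```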
    
  
