"nb" -- New Books Watch requirements: unix shell, cron, standard text tools -------------------------------------------- 'nb' is a Unix script that polls LibraryThing author pages and sends an email alert whenever a new work shows up that previously was not on LibraryThing. It is useful for keeping track of favorite authors who publish new books. The script uses LibraryThing as a backend database but you don't need an account there, it is accessed anonymously. INSTALL and RUN 1. Copy nb.sh to its own directory and edit: change the parameters at the top (self explanatory). NOTE: Be careful of line breaks (use an editor without word-wrap such as vi or 'pico -w'). Check for paths on your system with "which program" Set the script executable: chmod 755 nb.sh 2. Create a list of authors in a file called nb.cfg (or any name - the name is a command line argument). The file takes the form: lastnamefirstname lastnamefirstname etc.. Here's a sample cfg file with two authors: deli notify/nb> cat nb.cfg winchestersimon zolaemile deli notify/nb> These are the LibraryThing ID's as seen in the author page URL. Generally it is simply lastnamefirstname (lowercase no-space) but best to check the author's page on LibraryThing and look at the URL for the author's ID. For example, the Author Page for Emile Zola has a URL of: http://www.librarything.com/author/zolaemile&norefer=1 thus the ID is "zolaemile" (the part after "/author/" and before "&") 3. Run it manually for a test. Example: [/home/bob/nb] ./nb nb.cfg If no email arrives, check for the file "emailbody.txt" - if it's there, the problem is with the email system somewhere. 3. Add to cron. Here is a sample cron entry to run at 11:45pm 45 23 * * * /home/steve/nb/nb.sh /home/steve/nb/nb.cfg >> /dev/null 2>&1 I run mine once a week, it keeps the noise down, new books don't come out that often to need to run it that often. FILES If you're curious how it works.. For each author nb creates three file: a. 
authorname.new - the latest list of books b. authorname.old - the previous list of books c. authorname.htm - the raw HTML data from LibraryThing nb then compares the difference between .old and .new and emails any works that are in .new but not .old NOTES/BUGS a. Known bug: any book title with quotes (""), the part in quotes won't show up. Thus if the full title is in quotes (rare) it will be invisable. If only part of the title is in quotes, only that part won't show up. b. It only shows additional works, not subtractions (works removed by combination, deletion or canonical title change (see next note)). c. Work titles are determined by the "canonical title" under Common Knowledge. If there is no canonical title, LT uses the most common title. Thus, old works can show up as new if someone adds/changes the canonical title in common knowledge. d. The script logs into LibraryThing anonymous. You don't need a LT account. WARNINGS/ERRORS Warnings and Errors can be turned on/off as one of the customize variables at the top of the script (email_warnings, email_errors) Errors: a. "Bad HTML" - This could be caused by an inaccurate author ID in nb.cfg, or if LibraryThing is down or network unreachable. Recommend to always leave this on. Warnings: b. "Highload" - LibraryThing occasionally has high server load and blocks anonymous users from doing certain resource intensive actions. The script detects this and just skips doing anything. c. "Tempdown" - LibraryThing is reporting it is temporarily down. nb skips doing anything for that run. If there are too many warnings, change the time of day when the script runs to a slow period when LT is not as busy, like early morning. --- "New Books Watch" Stephen Balbach stephen@balbach.net http://bachlab.balbach.net/nb.html
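
EXAMPLE

The comparison described under FILES (email whatever is in .new but not
in .old) can be sketched roughly as below. This is only an illustration
and not necessarily how nb.sh does it: the helper name new_works and the
sample titles are made up, and it assumes mktemp is available alongside
the standard sort and comm tools.

```shell
#!/bin/sh
# Sketch only: the real nb.sh may compare the two lists differently.

# new_works OLD NEW: print the lines that are in NEW but not in OLD.
# (new_works is a hypothetical helper name, not taken from nb.sh.)
new_works() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || { rm -f "$old_sorted"; return 1; }
    # comm needs sorted input; -u also removes duplicate titles
    sort -u "$1" > "$old_sorted"
    sort -u "$2" > "$new_sorted"
    # -1 hides lines found only in OLD, -3 hides lines found in both,
    # leaving only the lines found only in NEW
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Made-up sample data standing in for zolaemile.old and zolaemile.new
demo_dir=$(mktemp -d) || exit 1
printf '%s\n' 'Germinal' 'Nana' > "$demo_dir/zolaemile.old"
printf '%s\n' 'Germinal' 'Nana' 'The Masterpiece' > "$demo_dir/zolaemile.new"

# Prints the one title that is new since the previous run
new_works "$demo_dir/zolaemile.old" "$demo_dir/zolaemile.new"
rm -rf "$demo_dir"
```

Because only lines unique to the .new list are printed, works that
disappear from LibraryThing produce no output, which matches note b
under NOTES/BUGS.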