January 09, 2014
Currently the way to find people at UC Berkeley is to use directory.berkeley.edu. This service wasn’t that intuitive for me to use.
Aiming for a clean and intuitive design, it adds the following features:
Node.js for crawling and exposing the API
Firebase querying - Setting up Firebase with AngularJS was pretty straightforward, but querying was somewhat of a pain. There’s currently no way to do database-like queries with Firebase.
Speed - Making it fast was of the utmost importance and the hardest part. In the beginning I loaded all of the more than 1,000,000 users into the browser, which made it extremely slow and not very scalable. Now the data is cached in Node.js and the API returns 10 results at a time.
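To illustrate the idea of caching in Node.js and returning 10 results at a time, here is a minimal sketch. It is not the project's actual code: `createDirectoryCache`, `PAGE_SIZE`, and the `name` field are hypothetical names, and simple substring matching is an assumption about how the search might work.

```javascript
// Hypothetical sketch: an in-memory cache that serves search results in pages.
const PAGE_SIZE = 10; // results per request, as described above

function createDirectoryCache(people) {
  // Lowercase each name once up front so every search is a cheap comparison.
  const indexed = people.map((person) => ({
    person,
    key: String(person.name).toLowerCase(),
  }));

  return {
    // Returns one page of matches plus the total number of matches.
    search(query, page = 0) {
      const needle = String(query).trim().toLowerCase();
      const matches = indexed.filter((entry) => entry.key.includes(needle));
      const start = page * PAGE_SIZE;
      return {
        total: matches.length,
        results: matches.slice(start, start + PAGE_SIZE).map((entry) => entry.person),
      };
    },
  };
}
```

The browser then only ever receives one small page per request, while the full data set stays in server memory.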
Scraping - In the first iteration I scraped the web version of the Berkeley directory at directory.berkeley.edu. This was incredibly slow and took 40+ hours to complete. Now the data comes from LDAP, which takes approximately 1 hour to complete.
ldapsearch -H ldap://ldap.berkeley.edu -x -b 'ou=people,dc=berkeley,dc=edu' 'objectclass=*'
However, connecting to LDAP with ldapjs, a great LDAP library for Node.js, proved to be a bit harder. Thanks to Mark Cavage for explaining that ldapjs search scopes are backwards. In the end, the only change was to set the scope to sub instead of using the default base.
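As a rough sketch of what the ldapjs side might look like, assuming the search `scope` option is the setting that had to change (ldapjs defaults it to `base`): the helper name `searchPeople` and the constants are hypothetical, and the base DN and filter are taken from the `ldapsearch` command above.

```javascript
// Hypothetical sketch: names here are illustrative, not the project's code.
const SEARCH_BASE = 'ou=people,dc=berkeley,dc=edu';
const SEARCH_OPTIONS = {
  filter: '(objectclass=*)',
  scope: 'sub', // search the whole subtree; ldapjs defaults to 'base'
};

// `client` is expected to be an ldapjs client, for example:
//   require('ldapjs').createClient({ url: 'ldap://ldap.berkeley.edu' })
function searchPeople(client, done) {
  client.search(SEARCH_BASE, SEARCH_OPTIONS, (err, res) => {
    if (err) return done(err);

    const entries = [];
    let finished = false;
    // Make sure `done` is called only once, whichever event fires first.
    const finish = (error) => {
      if (finished) return;
      finished = true;
      if (error) done(error);
      else done(null, entries);
    };

    res.on('searchEntry', (entry) => entries.push(entry));
    res.on('error', (error) => finish(error));
    res.on('end', () => finish(null));
  });
}
```

Each collected `entry` is whatever ldapjs emits on `searchEntry`; how attributes are read from it depends on the ldapjs version in use.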
As with any project, there are always improvements to be made:
If you would like to make any changes yourself, feel free to make a pull request on christianv/berkeleydir.