Hack&Roll 2019

Before leaving for Singapore, I was looking up some events that would be happening in the city in the first few weeks I would be here. I came across a hackathon, which piqued my interest. To my surprise (and delight), it was to be held at NUS where I would be studying. I registered for the event (Hack and Roll 2019), an easy process as I could claim student-ship at the hosting university. I wasn’t completely set on doing it but wanted to keep my options open for Things-To-Do upon arrival.

Unfortunately, none of the other exchangers had seen the event. Many were interested in participating when I told them about it, but by then it was too late to sign up. Because teammates can be – and generally are – the defining aspect of the experience at a hackathon, I hesitated to work with people I didn’t already know. The hackathon organizers made a Slack channel for people to post about themselves and their skills, so I decided to go ahead and see if I could find people with whom I would work well. I didn’t find anyone at first but eventually teamed up with two students studying computer science at NUS: one a fourth-year student and the other a second-year.

We ended up making a web app that provides a set of job recommendations based on the user’s resume. The advantage of JobMatcher (as we so creatively called it) was meant to be that it centralizes job postings from multiple sources, though our prototype only included listings from Indeed. The main technical challenge was coming up with a sorting algorithm that somehow prioritizes the collected jobs. Say, for example, the scraping module gathered 100 jobs from searches based on keywords from the user’s resume; how are we to present the jobs to the user? What makes a job “most relevant” to a user? Although we could have spent the whole hackathon thinking about this issue, we just decided to sort according to the number of matched keywords between the search results and the user’s resume. This solution is pretty much as brute-force and crude as you can come up with, but it was easy to implement and actually resulted in jobs that (at least for us) were quite relevant.
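For the curious, the keyword-count ranking described above can be sketched in a few lines of Python. This is a minimal reconstruction, not the code from our repo: the `tokenize` and `rank_jobs` names and the `description` field on each job are assumptions made for illustration, and it only handles single-word keywords.

```python
import re


def tokenize(text):
    """Lowercase the text and return the set of word-like tokens in it."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def rank_jobs(resume_keywords, jobs):
    """Sort jobs by how many resume keywords appear in each description.

    `jobs` is assumed to be a list of dicts with a "description" key
    (a hypothetical shape, not the scraper's real output).
    """
    keywords = {k.lower() for k in resume_keywords}
    scored = []
    for job in jobs:
        matches = keywords & tokenize(job["description"])
        scored.append((len(matches), job))
    # Python's sort is stable, so jobs with equal scores keep scrape order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [job for _, job in scored]


if __name__ == "__main__":
    jobs = [
        {"title": "Barista", "description": "Make coffee"},
        {"title": "Backend", "description": "Python and Flask APIs"},
    ]
    for job in rank_jobs(["python", "flask"], jobs):
        print(job["title"])
```

A count like this ignores how rare or important a keyword is, which is exactly why it felt crude; weighting keywords would be the obvious next step.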

We had originally wanted the user to provide access to their LinkedIn account as input (rather than a resume). LinkedIn provides much more structured data than resumes, making it easier to process. I actually spent the majority of my time working with the LinkedIn API to implement authentication. Although I learned quite a bit (mostly about OAuth authentication), we didn’t actually use LinkedIn as the source of user career information because of privacy restrictions. In order to gain full access to a user’s account, we would have had to be registered and verified as LinkedIn developers. Because we weren’t, we only had access to the user’s name, email, and tag line … clearly not enough career info to make predictions on which job listings would fit them.

Here’s the project on Devpost and the repo on GitHub.

I logged some of my thoughts throughout the hackathon, which are listed below. I didn’t start writing down notes until we were a bit into the process so the picture is not quite complete but it’s still representative of the ups and downs we experienced in developing the app.


16.32 First task after deciding what we actually want to work on: looking up how to use GCloud with a Flask backend. We’ve discussed using Google App Engine to deploy our app (rather than Heroku).

18.00 After about 2 hours of debugging – GCloud Talent Solution is in fact not what we need. It’s for a job portal, for example, where you can add your own jobs and then search for them. Big waste of time.

19.27 Post-dinner. We decided to continue with the project despite encountering significant setbacks. I am starting work on extracting key info from LinkedIn profiles. I first create an app on their website, and receive an application ID and secret key with which I can make requests on behalf of the user.
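For readers who haven’t used OAuth, here is a rough sketch of the first leg of the authorization-code flow that the application ID and secret key are for. It is not the code we wrote that night. The authorization URL is my recollection of LinkedIn’s OAuth 2.0 endpoint and should be checked against their documentation; the function names, the redirect URI, and the scope value are all placeholders.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint; verify against the provider's current documentation.
AUTH_URL = "https://www.linkedin.com/oauth/v2/authorization"


def build_authorization_url(client_id, redirect_uri, scope, state):
    """Build the URL the user is sent to in order to log in and grant access."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,  # random value echoed back to guard against CSRF
    }
    return AUTH_URL + "?" + urlencode(params)


def extract_code(callback_url, expected_state):
    """Pull the authorization code out of the redirect back to our app."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; refusing the callback")
    if "error" in query:
        raise ValueError("authorization failed: " + query["error"][0])
    return query["code"][0]


if __name__ == "__main__":
    state = secrets.token_urlsafe(16)
    # "my-app-id", the localhost URI, and "profile" are hypothetical values.
    print(build_authorization_url(
        "my-app-id", "http://localhost:5000/callback", "profile", state))
```

The second leg, not shown, is a server-side POST that trades the code plus the secret key for an access token. The secret key never goes to the browser.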

22.19 I just got the profile authentication and information in JSON form. Now, a user can log in on our page. Subsequently, we will be able to retrieve information from their profile. We plan to use their top 3 skills listed on their LinkedIn profile.

23.30 We just realized: LinkedIn doesn’t allow full profile access to non-registered developers. It’s a matter of privacy – they don’t allow third parties to access the complete profile.

02.03 We are trying to get around not having access to the profile. There are a few “alternative” solutions – like a web crawler that could go to the profile and search through the HTML for relevant information (past job titles, skills).

02.57 The web crawlers were to no avail, so we decided to just use a PDF text extractor to get information from a user’s resume. Surprisingly, there isn’t an obvious choice for a Python package to convert a PDF to text. PyPDF, for example, just returns a file containing large amounts of whitespace. We aren’t able to install other libraries, like PDFtoText, that should make life easy.

03.36 Even after more searching, more debugging, and help from a hackathon organizer, we weren’t able to find a Python module to extract text. We are now looking into front-end modules written in JS that would serve the same purpose.

04.45 I ended up going back to the Python modules with an ugly implementation because I feel like we don’t have any other choice.
Got the text extractor to work, and am able to write the text to a file. We are now looking at how to extract relevant information from the text file. Most importantly, we should be able to get keywords in the resume that would be useful in a “keyword search” in a job portal. The Python module Rake, a “domain-independent keyword extractor,” doesn’t do a great job getting relevant words. For one, it doesn’t recognize the difference between standard ASCII characters and special characters in the PDF. For example, a crude implementation returns “2015 interests snowboarding • triathlons • piano • backpacking • ultimate frisbee” as the most relevant phrase in my resume. Don’t think that’ll work.
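In hindsight, one way around the bullet problem would have been to split the extracted text on those special characters before handing it to a keyword extractor. The sketch below is not Rake and not what we shipped; `candidate_phrases` and the separator list are assumptions for illustration, using only the standard library.

```python
import re

# Characters treated as phrase boundaries: common bullet glyphs plus
# ordinary punctuation and line breaks (an assumed, incomplete list).
SEPARATORS = re.compile(r"[•·▪●|,;:\n\t]+")


def candidate_phrases(text):
    """Split raw resume text into short lowercase candidate phrases."""
    phrases = []
    for chunk in SEPARATORS.split(text):
        # Drop purely numeric tokens such as years.
        words = [w for w in chunk.lower().split() if not w.isdigit()]
        if words:
            phrases.append(" ".join(words))
    return phrases


if __name__ == "__main__":
    line = "2015 interests snowboarding • triathlons • piano"
    print(candidate_phrases(line))
```

Each resulting phrase is short enough to score or filter on its own, instead of one long run glued together by bullets.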

05.08 The main problem that we are encountering is unstructured, non-standard data from the resume.

06.54 I’m running out of steam. We got the keyword search working; we just have to pretty-fy everything and come up with a sorting algorithm to really rank the results.

I noticed the birds are chirping, so I walked outside to get a little closer. The sun’s almost out, I think I might use this early morning opportunity to see the sunrise.

The hackathon venue is right next to the tower block, so I decided to watch the sunrise from up top. I swiped in with my student card for the general area, and then headed for an elevator to take me up the 25 floors. I put my key card in front of the access point and tried to push one of the floors, but nothing. Denied. I guess I will have to work for the sunrise, climbing the stairs up to the top. About halfway up, I stopped to make SURE the elevator didn’t work. I went inside and started pressing the buttons for the top floors, but they didn’t stay lit up. So I leaned down and started pushing buttons, somewhat hoping the machine would malfunction and I’d catch a ride up. Then I noticed the doors were closed, and the “open door” button wouldn’t work without a valid key card. So yaaaaa I’m currently trapped in the elevator debating what I can do.

Update: I called the Mysterious Elevator Man From Above with the push of a button and he said ground floor was my only option so I’m back to squ… floor one. Debating if I’ll make it up.

Update: I made it. Nice view but not much of a sunrise – I can only see out the west end of the tower.

09.00 Took a wee nap. It was nice but I don’t feel too great. Aadit and Pankaj are working furiously on the UI of the web app, trying to display the info nicely with React.

12.00 We got everything looking semi-presentable and have practiced our project pitch to a few other groups. Now we wait to present to the judges.

Mods or Modules or Courses or Classes or

Courses are referred to as “modules” at NUS. Most students call them “mods” in conversation. And, as is the case for the start of any semester, mods have been the focal point of many conversations over the past couple of weeks. I encountered a fair amount of trouble when I was first applying for acceptance to the university; NUS is known (well, at least in the Rice study abroad office) for making it difficult to get into classes due to limited spots. In November, I submitted my top eight choices for courses… and got into one. After looking around for more course options and talking to professors at Rice to ensure they would transfer, I found five others that would work. I got into two of those, putting me at the minimum three modules required for acceptance. They weren’t the courses I wanted, but I was told there is quite a bit of mod movement during the add/drop period.

Courses are generally in two- to three-hour blocks and meet just once or twice a week, as opposed to twice or thrice like at most American universities. Thus far, I have found that the longer blocks help me engage with the material more. It’s also good in that there seem to be fewer conflicts because the courses meet less often. That is, you could fit 7-8 mods into your schedule (some NUS students unfortunately do) whereas you’d be hard-pressed to coordinate that many courses if they met thrice a week. This system does, however, make it easy to forget about a course for a few days since there can be 5-7 days in between lectures. The classes are blocked “on the hour” but usually let out about 30 minutes before the nominal end time so that students can make it to the next class. So, for example, my 10-12 Networks lecture ends by ~11:35, giving me plenty of time to make it to my 12-2 Microelectronics lecture.

EL3211 Language in Contact

This module introduces students to the phenomenon of language contact. We will explore sociolinguistic conditions of language contact, and how these conditions lead to contact-induced linguistic change. The study of contact languages is a study of how new forms of language emerge from contact ecologies. The main focus of the module is on the linguistic properties of contact languages, such as Chinese Pidgin English and Singapore Colloquial English, and on the theoretical issues of language emergence.

This is a third-year course in Linguistics. Of all my mods, I am most excited about this one. Part of the reason I chose to study in Asia/Singapore is that I wanted a different perspective from the Western-centric style in which Rice and other universities in the U.S. and Europe teach Linguistics. My Rice professors admit that, up until very recently, the Linguistics community was primarily Western-born, speaking mostly Indo-European languages.

The professor for the course asked us to call her by her first name: Mie. She’s an older Japanese woman, cute and personable. She cares about her students but expects a lot from them. I wasn’t quite sure what to think of her or the course when she played loud heavy metal music on the first day as we were walking in. Evidently, that’s the usual for her. We start off every lecture with a dose of heavy death metal. During our short break (she gives us about 10 minutes to rest as the lecture is two and a half hours long), she goes on YouTube and puts on pranks for us to watch – something a little like this.

Something interesting about the Linguistics courses at NUS – they are listed and instructed under the “English Language” department. This was surprising in that Linguistics deals with cross-linguistic principles. It doesn’t make much sense to have the study of all languages (Linguistics) be a subset of the study of one language (English). Haven’t figured out why that is.

LAC1201 Chinese Mandarin

This is a beginners’ module consisting of three main components: conversation, grammar and Chinese characters learning. Vocabulary items, sentence patterns and short texts will be taught. Students will acquire basic communicative skills to deal with simple daily situations after reading this module. Approximately 180 Chinese characters and 150 phrases will be introduced.

This will be my first exposure to Mandarin. I don’t think the course will transfer back for any meaningful credit at Rice, but I’ve been looking forward to learning Chinese for a while.

I have a few frustrations related to this course, but I’m optimistic that it’ll get better. The first two lectures were spent repeating the sounds of initials and finals over and over and over and over and over. The professor had us repeat every initial-final combination in all four tones multiple times. Of course, it’s important to learn pronunciation. But I think our time would be better spent on words and cultural context. I would rather practice pronunciation on my own by listening to the audio that comes with our textbook.

Mandarin Chinese Initials and Finals

This is the first language I’ve tried to learn after becoming familiar with Linguistic principles. I am disappointed in how little linguistics will help me in this course. I asked a question that was mildly linguistic after lecture, but the professor wasn’t familiar with the idea. He said he was trained as a journalist and wouldn’t be able to answer my linguistics-based questions. In any case, it wouldn’t make sense to teach from a Linguistics perspective because most students aren’t familiar with the vocabulary. Two different people I’ve reached out to in the department have implied (and one explicitly stated) that they believe Linguistics is largely unhelpful in their pedagogical approach to Mandarin.

It’s not that Linguistics isn’t helpful in my learning at all – it’s just not to the extent that I expected it to be. My phonetics and phonology training helps me become familiar with the sound patterns in Mandarin. Having the linguistic vocabulary is also advantageous when searching a question on the Internet.

EE4210 Network Protocols and Applications

This advanced networking module aims to equip students with the basics and theories of Internet-related technologies, which are necessary for computer/network engineers. The topics that will be covered include Internet architecture, Internet applications and their protocols (HTTP, FTP, DNS, Email, P2P, BitTorrent, etc.), wireless and mobile networks, mobility management, multimedia networking, and network security.

EE3431C Microelectronics Materials and Devices

Electronic devices are the basic building blocks of all electronic gadgets used in our daily life. A solid understanding of the fundamental device concepts is essential for the electrical engineer to keep up with the fast evolution of new device technology. This module emphasizes the properties of electronic materials and the operation principles of key electronic devices including p-n diode, bipolar junction transistor (BJT), and MOS capacitor (MOSCAP). Additional issues related to dielectric materials and non-semiconductor materials will be introduced. Contacts between metal and semiconductor will also be covered.

 

The Singaporean Zoo

I went with a few other exchangers, who convinced me to go with the claim that it’s one of the best zoos in the world. Indeed, their website claims it’s the “World’s Best Rainforest Zoo.”

It was enjoyable, but we didn’t really find it to be world-class. Our high expectations were let down at first but we got to interact with some animals by the end of the day. Then again, a good zoo may not be the most enjoyable zoo for a visitor if the keepers are looking out for the best interest of the animals.

I would like to go back for the River Safari and Night Safari because they are more unique to the Singaporean zoo. I’ve heard they are unlike any typical zoo experience.

Some of my favorite pictures below.

 

SG First Impressions

Clean (wow very clean). Trees (lots and lots of them, lining all the roadways and in buildings and on top of buildings). Sleek (glass and shiny metal). Modern (young nation, yeah, but I didn’t know it would look this cool). Green (more trees, more grass). Expensive (this seems a little too good to be true – they must have to pay for all this nice stuff?). Efficient roadways (no honking, new cars, smart drivers). Methodist church, Buddhist temple, Hindu temple (all so close to each other).

One of the many buildings covered in green. It’s the Park Royal Hotel near Clarke Quay. I took this one on a walk there recently.


Heading for Singapore: Expectations

I wrote down some of my thoughts before I left, highlighting what I was most excited for.

  • As I was packing for Singapore, I put on BBC’s Planet Earth and watched it on repeat. It’s one of the few TV shows I’ve actually watched all the way through. In the last episode of the series, BBC gives a shout-out to SG for its Green philosophy. It’s known as the Garden City, and has plans to become a City Within a Garden. I can’t wait to go on runs in one of the many parks and not be suffocated as I was last summer in Bangalore.
Super Trees at Gardens by the Bay, Singapore. Image: BBC’s Planet Earth II: Cities
  • I’m curious about how the fairly well-defined cultures interact in Singapore. The main ethnic groups are Indian (mostly southern Indians), Malay, and Chinese. How prominent are these cultures in everyday life in SG? How does language relate to these ethnic groups – if someone identifies as part of one of these groups, will they speak their mother tongue?
