Dheeru Mundluru
dheeru.m at g(as in google)mail dot com


Welcome to my Web home. My name is Dheerendranath Mundluru, and most people know me as just Dheeru. I grew up in the beautiful city of Hyderabad, in southern India. I am currently working as a Research Scientist at Local.com in Irvine, California. I also recently completed my Ph.D.; my advisor was Dr. Vijay Raghavan. Below are a few more details about my work. I can be contacted at the e-mail address above.


Research

My current research interests are in the following areas of Web data mining:

My dissertation was in the area of Web Information Extraction. A major contribution of my dissertation is a highly effective and efficient structured data extraction algorithm called PIE (Path-based Information Extractor). PIE can be used in systems such as Deep Web crawlers, metasearch engines, and online comparative shopping engines. If you would like to know more about PIE, you can read chapter 2 of my dissertation.

At Local.com, I developed a Deep Web crawler called LocalDeepBot, which can crawl sources such as review sites (e.g., TripAdvisor.com) and store locator services of franchise sources (e.g., Walmart.com). I also developed a robust wrapper induction system called Guitar (short for GUIded exTrActoR). The wrappers (extraction rules) created using Guitar are used by LocalDeepBot for effectively and efficiently crawling Deep Web resources. To get a feel for these two systems, you can read my paper below titled Experiences in Crawling Deep Web in the Context of Local Search. I am also currently developing a very effective and efficient Opinion Mining system. I will give more details about this in the near future.


Education

Ph.D. in Computer Science, University of Louisiana at Lafayette (2008)
M.S. in Computer Science, University of Louisiana at Lafayette (ULL) (2003)
B.E. in Computer Science and Engineering, University of Madras, India (2000)


Professional Experience

Jul '06 - Present
Research Scientist, Local.com Corporation, Irvine, CA

Jan '04 - May '06
Research Assistant, Laboratory for Internet Computing (LINC), CACS, ULL

Aug '05 - May '06
Teaching Assistant, CACS, ULL
(CMPS 561 - Information Storage and Retrieval in Fall 2005, CMPS 566 - Data Mining in Spring 2006)

Jun '04 - Jul '04
Software Engineer (Summer Intern), Webscalers L.L.C., Lafayette, LA

May '01 - Dec '03
Research Assistant, Center for Business and Information Technologies (CBIT), ULL

May '02 - Aug '02
Software Engineer (Summer Intern), Thought Creek Inc., Campbell, CA


Publications
  1. D. Mundluru. Automatically Constructing Wrappers for Effective and Efficient Web Information Extraction. PhD Thesis, University of Louisiana at Lafayette, 2008. (pdf)

  2. D. Mundluru and X. Xia. Experiences in Crawling Deep Web in the Context of Local Search. In Proceedings of the Fifth Workshop on Geographic Information Retrieval, Napa Valley, 2008. (pdf)

  3. D. Mundluru, J. Katukuri, and S. Celebi. Automatically Mining Result Records from Search Engine Response Pages. In Proceedings of the Fifth IEEE International Conference on Data Mining, pages 749-752, Houston, 2005. (pdf | Power Point) -- The algorithm in my dissertation is a much more advanced version of the algorithm proposed in this paper. If you are interested, I strongly recommend reading the latest algorithm in my dissertation.

  4. D. Mundluru, Z. Wu, V. Raghavan, W. Meng, and H. Zhao. Automatically Extracting Subsequent Response Pages from Web Search Sources. In Proceedings of IEEE Workshop on Knowledge Acquisition from Distributed, Autonomous, Semantically Heterogeneous Data and Knowledge Sources, Houston, 2005. (pdf | Power Point)

  5. D. Mundluru, Z. Wu, V. Raghavan, J. Katukuri, and S. Celebi. Automatically Mining Search Result Records. Technical Report CACS-TR-2005-3-1, Center for Advanced Computer Studies, University of Louisiana at Lafayette, 2005. (pdf) -- The algorithm in my dissertation is a much more advanced version of the algorithm proposed in this report. If you are interested, I strongly recommend reading the latest algorithm in my dissertation.

  6. Y. Xie, D. Mundluru, and V. Raghavan. Incorporating Agent Based Neural Network Model for Adaptive Meta-Search. In Proceedings of the Forty Third ACM Southeast Conference, pages 53-58, Kennesaw, 2005. (pdf)

  7. Z. Wu, D. Mundluru, and V. Raghavan. Automatically Detecting Boolean Operations Supported by Search Engines, towards Search Engine Query Language Discovery. In Proceedings of The Second International Workshop on Web-based Support Systems, pages 171-178, Beijing, China, 2004. (pdf | Power Point)

Selected Past Projects

Access Louisiana Business - I led this project while working at CBIT as a research assistant. The project earned CBIT the coveted Lantern 2003 award for economic contribution from the Louisiana Governor's office. An article recognizing the project's contribution, including our team's photo, was published in the Spring 2003 issue of La Louisiane magazine.

Mobile Information Manager (MIM) - I worked on this project while I was working as a summer intern at wAppearances in 2002. MIM is a Web-based wireless software application for venue/event management. I designed and developed most of this system.

QoS Routing Using Bandwidth Reservation - I worked on this project in Fall 2001 & Spring 2002. In this project, I implemented a scalable solution that achieves good routing performance with reduced processing cost in QoS routing.


Presentations

Guest lectures in Information Retrieval class in Fall 05:

      (i) Web Content Mining: Deep Web Crawling

      (ii) Web Content Mining: Extracting Structured Data from the Web

Automatically Detecting Query Language Features of a Search Engine

Web Services - Introduces Web Services, ebXML (another specification similar to Web Services), some of the open problems in the area of Web Services, two architectures that focus on accessing Web Services from wireless devices, and finally the Semantic Web.

Java 2 Micro Edition - A very detailed introduction to J2ME and its architecture.

Continuous Integration + MVC paradigm - Talks about the importance of testing and how testing should be done in real-world projects. Also gives a good introduction to the MVC paradigm.

Microsoft Passport vs Liberty Alliance Project - Compares the architectures of Microsoft Passport and the Liberty Alliance Project. Also gives a very good introduction to JAAS (Java Authentication and Authorization Service).

Security - Gives a good introduction to Secure Sockets Layer (SSL) and shows how businesses should manage security. This presentation also covers Servlet security, firewall architectures, and finally database security.


Bookmarks

Web Data Mining

People working on Information Extraction

University of Illinois at Chicago

Universidade da Coruña

Binghamton University

University College Dublin

CMU


Miscellaneous

Success Soul

No end in sight for Africa's suffering masses. By Jeff Koinange

As We May Think. By Vannevar Bush. The Atlantic Monthly, July 1945.

Java World

Servlets.com

The Server Side

Quotes

Digg - A social bookmarking site.


Last Modified On: September 16th, 2008