Professional Positions

  • 2014-Present

    Associate Researcher

    Barcelona Supercomputing Center (BSC)
    Autonomic Systems and e-Business Platforms

  • 2014

    Data Analyst

    Aplicaciones de Inteligencia Artificial, S.A. (AIS)
    Barcelona

  • 2011-2014

    Business Intelligence Analyst

    Accenture Inc. (ACN)
    Barcelona

  • 2010

    Visiting Student

    Georgia Institute of Technology
    Broadband Wireless Networking (BWN) Lab

Education & Training

  • Ph.D. 2018

    Computer Architecture

    UPC - BarcelonaTech
    Computer Architecture Department

  • M.Sc. 2014

    Master of Science in Information and Communication Technologies

    UPC - BarcelonaTech

  • Eng. 2010

    Telecommunications Engineering

    UPC - BarcelonaTech

Other Activities

  • 2014-Present

    International Conference on Predictive Applications and APIs (PAPIs)

    Submission Chair, Program Committee Assistant & staff member

  • 2016

    Spark Barcelona Meetup

    Co-organizer

Honors, Awards and Grants

  • 2015
    Accenture - MINT Award

    Best academic record in the Master of Science in Information and Communication Technologies (MINT)

  • 2015
    Red.es - MINT Award

    Best Master Thesis in the area of Information Systems

  • 2014-2018
    “la Caixa” Foundation Fellowship
    “la Caixa” Fellowship, awarded to outstanding graduate students in Spain to pursue doctoral studies with full funding support. Related news about the fellowship program and some details about the project are available in Spanish.
  • 2014
    Innova Challenge MX
    1st Prize in the Big Data contest organized by BBVA in the category of Apps for Companies with the web application PEAR campaigns.

    Team members: Jordi Aranda, Jose Cordero, Jordi Nin, David Solans and myself

  • 2010
    International Mobility Grant
    Bancaja and UPC grant to financially support the international mobility of UPC students.
  • 2010
    USA Mobility Grant
    Vodafone Foundation grant to support UPC visiting researchers in the USA.
  • 2008
    Telecom engineering Award
    5th best academic record in Telecom Engineering - ETSETB - BCNTelecom.

Scaling DBSCAN-like Algorithms for Event Detection Systems in Twitter

Joan Capdevila, Gonzalo Pericacho, Jordi Torres, Jesús Cerquides
Conference Papers 16th International Conference on Algorithms and Architectures for Parallel Processing, December 2016

Abstract

The increasing use of mobile social networks has lately transformed news media. Real-world events are nowadays reported in social networks much faster than in traditional channels. As a result, the autonomous detection of events from networks like Twitter has gained a lot of interest in both research and media groups. DBSCAN-like algorithms constitute a well-known clustering approach to retrospective event detection. However, scaling such algorithms to geographically large regions and temporally long periods presents two major challenges. First, detecting real-world events from the vast amount of tweets can no longer be performed on a single machine. Second, the tweeting activity varies a lot within these broad space-time regions, limiting the use of global parameters. Against this background, we propose to scale DBSCAN-like event detection techniques by parallelizing and distributing them through a novel density-aware MapReduce scheme. The proposed scheme partitions tweet data according to its spatial and temporal features and tailors local DBSCAN parameters to local tweet densities. We implement the scheme in Apache Spark and evaluate its performance on a dataset composed of geo-located tweets in the Iberian peninsula during the course of several football matches. The results point to the benefits of our proposal over other state-of-the-art techniques in terms of speed-up and detection accuracy.

Tweet-SCAN: An event discovery technique for geo-located tweets

Joan Capdevila, Jesús Cerquides, Jordi Nin, Jordi Torres
Journal Paper Pattern Recognition Letters, Available online 25 August 2016

Abstract

Twitter has become one of the most popular Location-based Social Networks (LBSNs) that bridge the physical and virtual worlds. Tweets, 140-character-long messages, aim to answer the question "What’s happening?". Occurrences and events in real life (such as political protests, music concerts, natural disasters or terrorist acts) are usually reported through geo-located tweets by users on site. Separating event-related tweets from the rest is a challenging problem that necessarily requires exploiting different tweet features. With that in mind, we propose Tweet-SCAN, a novel event discovery technique based on the popular density-based clustering algorithm called DBSCAN. Tweet-SCAN takes into account four main features of a tweet, namely content, time, location and user, to group together event-related tweets. The proposed technique models textual content through a probabilistic topic model called Hierarchical Dirichlet Process and introduces the Jensen–Shannon distance for the task of neighborhood identification in the textual dimension. We evaluate Tweet-SCAN on two real data sets of geo-located tweets posted during Barcelona local festivities in 2014 and 2015, for which some of the events were identified by domain experts beforehand. Through these tagged data sets, we are able to assess Tweet-SCAN's ability to discover events, justify the use of a textual component and highlight the effects of several parameters.
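
As an illustration of the textual neighborhood test mentioned in the abstract, here is a minimal sketch of the Jensen–Shannon distance between two topic distributions. It is not the paper's implementation: the function names, the threshold `eps_text`, and the assumption that each tweet is already represented as a probability vector over topics are all hypothetical choices made for this example.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (base 2), a metric in [0, 1]."""
    # Mixture distribution; m_i > 0 whenever p_i > 0 or q_i > 0, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding errors


def is_text_neighbor(p, q, eps_text):
    """Hypothetical helper: two tweets are textual neighbors if their topic
    distributions lie within eps_text of each other."""
    return jensen_shannon_distance(p, q) <= eps_text


# Assumed example: topic proportions of two tweets over three topics.
tweet_a = [0.7, 0.2, 0.1]
tweet_b = [0.6, 0.3, 0.1]
print(is_text_neighbor(tweet_a, tweet_b, eps_text=0.2))
```

With base-2 logarithms the distance is 0 for identical distributions and 1 for distributions with disjoint support, which makes a threshold in that range easy to interpret.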

Recognizing warblers: a probabilistic model for event detection in Twitter

Joan Capdevila, Jesús Cerquides, Jordi Torres
Conference Papers Anomaly Detection Workshop at the International Conference on Machine Learning, June 2016

Abstract

Event detection in Twitter poses a set of new challenges since social networks were not specifically designed for this task. The event identification capabilities of existing probabilistic models are far from state-of-the-art. In this paper we identify three key factors which, when combined, boost the accuracy of such models. Firstly, we notice that the large amount of meaningless social data requires modeling non-event observations. Secondly, we note that the tweeting activity varies in space and time. Thirdly, we observe that the shortness of tweets hampers the application of traditional topic models. Consequently, we propose WARBLE, a new probabilistic model and variational learning algorithm for retrospective event detection that explicitly considers all three factors. The preliminary results show that the proposed model outperforms other state-of-the-art techniques in detecting various types of events while relying on a principled probabilistic framework to reason under uncertainty.

Technical Report: Variational forms and updates for the WARBLE model

Joan Capdevila, Jesús Cerquides, Jordi Torres
Unpublished Technical note

Abstract

This technical note contains the variational forms and updates for the WARBLE model.

GeoSRS: A hybrid social recommender system for geolocated data

Joan Capdevila, Marta Arias, Argimiro Arratia
Journal Paper Information Systems, Volume 57, April 2016, Pages 111–128

Abstract

We present GeoSRS, a hybrid recommender system for a popular location-based social network (LBSN), in which users are able to write short reviews on the places of interest they visit. Using state-of-the-art text mining techniques, our system recommends locations to users using as source the whole set of text reviews in addition to their geographical location. To evaluate our system, we have collected our own data sets by crawling the social network Foursquare. To do this efficiently, we propose the use of a parallel version of the Quadtree technique, which may be applicable to crawling/exploring other spatially distributed sources. Finally, we study the performance of GeoSRS on our collected data set and conclude that by combining sentiment analysis and text modeling, GeoSRS generates more accurate recommendations. The performance of the system improves as more reviews are available, which further motivates the use of large-scale crawling techniques such as the Quadtree.
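
The quadtree-style crawling idea can be sketched roughly as follows. This is only an illustrative, sequential sketch, not the crawler from the paper: it assumes that a location query returns at most `cap` results per bounding box, and `make_fetch`, `crawl` and `min_size` are hypothetical names, with `make_fetch` standing in for a real API call.

```python
def make_fetch(points, cap):
    """Stand-in for a capped location API: returns at most `cap` points inside bbox."""
    def fetch(bbox):
        x0, y0, x1, y1 = bbox
        inside = [p for p in points if x0 <= p[0] <= x1 and y0 <= p[1] <= y1]
        return inside[:cap]
    return fetch


def crawl(bbox, fetch, cap, min_size=1e-6):
    """Collect all points in bbox, splitting it into four quadrants
    whenever a query hits the result cap (and may therefore be truncated)."""
    x0, y0, x1, y1 = bbox
    results = fetch(bbox)
    # Fewer results than the cap means the answer is complete for this cell.
    if len(results) < cap or (x1 - x0) <= min_size:
        return set(results)
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [
        (x0, y0, xm, ym), (xm, y0, x1, ym),
        (x0, ym, xm, y1), (xm, ym, x1, y1),
    ]
    found = set()
    for quad in quadrants:  # quadrants are independent of each other
        found |= crawl(quad, fetch, cap, min_size)
    return found


# Assumed toy data: a 10 x 10 grid of venues in the unit square.
points = [(i / 10, j / 10) for i in range(10) for j in range(10)]
venues = crawl((0.0, 0.0, 1.0, 1.0), make_fetch(points, cap=5), cap=5)
print(len(venues))
```

Because each quadrant is explored independently, the loop over `quadrants` is the natural place to distribute work across parallel workers.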

Tweet-SCAN: An event discovery technique for geo-located tweets

Joan Capdevila, Jesús Cerquides, Jordi Nin, Jordi Torres
Conference Papers Proceedings of the 18th International Conference of the Catalan Association for Artificial Intelligence, October 2015

Abstract

Twitter has become one of the most popular Location-Based Social Networks (LBSNs) that bridge the physical and virtual worlds. Tweets, 140-character-long messages published on Twitter, aim to provide basic answers to the question "What’s happening?". Occurrences and events in real life are usually reported through geo-located tweets by users on site. Separating event-related tweets from the rest is a challenging problem that necessarily requires exploiting different tweet features. With that in mind, we propose Tweet-SCAN, a novel event discovery technique based on the density-based clustering algorithm called DBSCAN. Tweet-SCAN takes into account four main features of a tweet, namely content, time, location and user, to cluster event-related tweets homogeneously. This new technique models textual content through a probabilistic topic model called Hierarchical Dirichlet Process and introduces the Jensen-Shannon distance for the task of neighborhood identification in the textual dimension. We evaluate Tweet-SCAN on a real data set of geo-located tweets posted during Barcelona local festivities in 2014, for which some of the events were known beforehand. By means of this data set, we are able to assess Tweet-SCAN's ability to discover events, justify the use of a textual component and highlight the effects of several parameters.

PHLAME: A Physical Layer Aware MAC protocol for Electromagnetic nanonetworks in the Terahertz Band

Josep Miquel Jornet, Joan Capdevila Pujol, Josep Solé Pareta
Journal Paper Nano Communication Networks, Volume 3, Issue 1, March 2012, Pages 74-81

Abstract

Nanonetworks will enable advanced applications of nanotechnology in the biomedical, industrial, environmental and military fields, by allowing integrated nano-devices to communicate and to share information. Due to the expectedly very high density of nano-devices in nanonetworks, novel Medium Access Control (MAC) protocols are needed to regulate the access to the channel and to coordinate concurrent transmissions among nano-devices. In this paper, a new PHysical Layer Aware MAC protocol for Electromagnetic nanonetworks in the Terahertz Band (PHLAME) is presented. This protocol is built on top of a novel pulse-based communication scheme for nanonetworks and exploits the benefits of novel low-weight channel coding schemes. In PHLAME, the transmitting and receiving nano-devices jointly select the optimal communication scheme parameters and the channel coding scheme which maximize the probability of successfully decoding the received information while minimizing the generated multi-user interference. The performance of the protocol is analyzed in terms of energy consumption, delay and achievable throughput, by taking also into account the energy limitations of nano-devices. The results show that PHLAME, by exploiting the properties of the Terahertz Band and being aware of the nano-devices’ limitations, is able to support very densely populated nanonetworks with nano-devices transmitting at tens of Gigabit/second.

PHLAME: A physical layer aware MAC protocol for electromagnetic nanonetworks

J. Capdevila Pujol, J.M. Jornet, J.S. Pareta
Conference Papers Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on Molecular and Nano Scale Communication (MoNaCom), April 2011, Pages 431 - 436

Abstract

Nanotechnology is enabling the development of integrated devices just a few hundred nanometers in size. Communication among these nano-devices will boost the applications of nanotechnology in the biomedical, environmental and military fields. Within the communication alternatives at the nanoscale, the state of the art in nanomaterial research points to the Terahertz band (0.1-10 THz) as the frequency range of operation of graphene-based electromagnetic (EM) nano-transceivers. This frequency band supports very large transmission bit-rates and enables simple communication mechanisms suited to the limited capabilities of nano-devices. Due to an expectedly very large number of nano-devices sharing the same channel, it is necessary to develop new Medium Access Control (MAC) protocols which will be able to capture the peculiarities of nanonetworks in the Terahertz band. In this paper, PHLAME, a physical layer aware MAC protocol for electromagnetic nanonetworks, is introduced. This protocol is built on top of a novel communication scheme based on the exchange of femtosecond-long pulses spread in time, and exploits the benefits of novel low-weight channel coding schemes. In the PHLAME protocol, the transmitting and receiving nano-devices jointly select the communication parameters that minimize the interference in the nanonetwork and maximize the probability of successfully decoding the received information. The performance of the protocol is analyzed in terms of energy consumption, delay and achievable throughput, by taking also into account the energy limitations of nano-devices. The results show that, despite its simplicity, the PHLAME protocol is able to support densely populated nanonetworks by exploiting the peculiarities of the Terahertz band.

At My Office

You can also find me at my office at the Polytechnic University of Catalonia, C/ Jordi Girona 1-3, Campus Nord, in Barcelona. The office is in the C6 building, 2nd floor, room 221.
