Barcelona Supercomputing Center (BSC)
Autonomic Systems and e-Business Platforms
I grew up in Coll de Nargó, a small village in the region of Alt Urgell in the Catalan Pyrenees. I received my M.Sc. and Engineering degree from the electrical and computer engineering school (ETSETB - TelecomBCN) at UPC in Barcelona. I specialized in Signal Processing and Communication Systems and conducted research in the novel field of Nanonetworks at the BWN Lab at the Georgia Institute of Technology (GaTech), supervised by Prof. Ian F. Akyildiz and Dr. Josep Miquel Jornet.
After three years in the IT industry, working first as a Business Intelligence Analyst (Accenture Inc.) and later as a Data Analyst (AIS, Aplicaciones de Inteligencia Artificial, S.A.), I am back in academia to do research in Statistical Machine Learning. In particular, I am interested in text anomaly detection through probabilistic models and Bayesian inference. This work is co-supervised by Prof. Jordi Torres and Dr. Jesús Cerquides and sponsored by the "LaCaixa" Foundation.
I am also a member of the organizing team of the International Conference on Predictive Apps and APIs (PAPIs).
Barcelona Supercomputing Center (BSC)
Autonomic Systems and e-Business Platforms
Aplicaciones de Inteligencia Artificial, S.A. (AIS)
Accenture Inc. (ACN)
Georgia Institute of Technology
Broadband Wireless Networking (BWN) Lab
Computer Architecture Department
Master of Science in Information and Communication Technologies
UPC - BarcelonaTech
UPC - BarcelonaTech
Best academic record in the Master of Science in Information and Communication Technologies (MINT)
Best Master Thesis in the area of Information Systems
Team members: Jordi Aranda, Jose Cordero, Jordi Nin, David Solans and myself
The increasing use of mobile social networks has lately transformed news media. Real-world events are nowadays reported in social networks much faster than in traditional channels. As a result, the autonomous detection of events from networks like Twitter has gained a lot of interest in both research and media groups. DBSCAN-like algorithms constitute a well-known clustering approach to retrospective event detection. However, scaling such algorithms to geographically large regions and temporally long periods presents two major shortcomings. First, detecting real-world events from the vast amount of tweets can no longer be performed on a single machine. Second, the tweeting activity varies a lot within these broad space-time regions, limiting the use of global parameters. Against this background, we propose to scale DBSCAN-like event detection techniques by parallelizing and distributing them through a novel density-aware MapReduce scheme. The proposed scheme partitions tweet data according to its spatial and temporal features and tailors local DBSCAN parameters to local tweet densities. We implement the scheme in Apache Spark and evaluate its performance on a dataset composed of geo-located tweets in the Iberian peninsula during the course of several football matches. The results point to the benefits of our proposal over other state-of-the-art techniques in terms of speed-up and detection accuracy.
Twitter has become one of the most popular Location-based Social Networks (LBSNs) that bridge the physical and virtual worlds. Tweets, 140-character-long messages, are meant to answer the What’s happening? question. Occurrences and events in real life (such as political protests, music concerts, natural disasters or terrorist acts) are usually reported through geo-located tweets by users on site. Separating event-related tweets from the rest is a challenging problem that necessarily requires exploiting different tweet features. With that in mind, we propose Tweet-SCAN, a novel event discovery technique based on the popular density-based clustering algorithm called DBSCAN. Tweet-SCAN takes into account four main features of a tweet, namely content, time, location and user, to group together event-related tweets. The proposed technique models textual content through a probabilistic topic model called the Hierarchical Dirichlet Process and introduces the Jensen–Shannon distance for the task of neighborhood identification in the textual dimension. We show Tweet-SCAN's performance on two real data sets of geo-located tweets posted during Barcelona local festivities in 2014 and 2015, for which some of the events were identified by domain experts beforehand. Through these tagged data sets, we are able to assess Tweet-SCAN's capability to discover events, justify the use of a textual component and highlight the effects of several parameters.
Event detection in Twitter poses a set of new challenges since social networks were not specifically designed for this task. The event identification capabilities of existing probabilistic models are far from state-of-the-art. In this paper we identify three key factors which, when combined, boost the accuracy of such models. Firstly, we notice that the large amount of meaningless social data requires modeling non-event observations. Secondly, we note that the tweeting activity varies in space and time. Thirdly, we observe that the shortness of tweets hampers the application of traditional topic models. Consequently, we propose WARBLE, a new probabilistic model and variational learning algorithm for retrospective event detection that explicitly considers all three factors. The preliminary results show that the proposed model outperforms other state-of-the-art techniques in detecting various types of events while relying on a principled probabilistic framework to reason under uncertainty.
This technical note contains the variational forms and updates for the WARBLE model.
We present GeoSRS, a hybrid recommender system for a popular location-based social network (LBSN), in which users are able to write short reviews on the places of interest they visit. Using state-of-the-art text mining techniques, our system recommends locations to users using as source the whole set of text reviews in addition to their geographical location. To evaluate our system, we have collected our own data sets by crawling the social network Foursquare. To do this efficiently, we propose the use of a parallel version of the Quadtree technique, which may be applicable to crawling/exploring other spatially distributed sources. Finally, we study the performance of GeoSRS on our collected data set and conclude that by combining sentiment analysis and text modeling, GeoSRS generates more accurate recommendations. The performance of the system improves as more reviews are available, which further motivates the use of large-scale crawling techniques such as the Quadtree.
Twitter has become one of the most popular Location-Based Social Networks (LBSNs) that bridge the physical and virtual worlds. Tweets, 140-character-long messages published on Twitter, are meant to provide basic responses to the What’s happening? question. Occurrences and events in real life are usually reported through geo-located tweets by users on site. Separating event-related tweets from the rest is a challenging problem that necessarily requires exploiting different tweet features. With that in mind, we propose Tweet-SCAN, a novel event discovery technique based on the density-based clustering algorithm called DBSCAN. Tweet-SCAN takes into account four main features of a tweet, namely content, time, location and user, to cluster event-related tweets homogeneously. This new technique models textual content through a probabilistic topic model called the Hierarchical Dirichlet Process and introduces the Jensen-Shannon distance for the task of neighborhood identification in the textual dimension. We show Tweet-SCAN's performance on a real data set of geo-located tweets posted during Barcelona local festivities in 2014, for which some of the events were known beforehand. By means of this data set, we are able to assess Tweet-SCAN's capability to discover events, justify the use of a textual component and highlight the effects of several parameters.
Nanonetworks will enable advanced applications of nanotechnology in the biomedical, industrial, environmental and military fields, by allowing integrated nano-devices to communicate and to share information. Due to the expected very high density of nano-devices in nanonetworks, novel Medium Access Control (MAC) protocols are needed to regulate access to the channel and to coordinate concurrent transmissions among nano-devices. In this paper, a new PHysical Layer Aware MAC protocol for Electromagnetic nanonetworks in the Terahertz Band (PHLAME) is presented. This protocol is built on top of a novel pulse-based communication scheme for nanonetworks and exploits the benefits of novel low-weight channel coding schemes. In PHLAME, the transmitting and receiving nano-devices jointly select the optimal communication scheme parameters and the channel coding scheme which maximize the probability of successfully decoding the received information while minimizing the generated multi-user interference. The performance of the protocol is analyzed in terms of energy consumption, delay and achievable throughput, also taking into account the energy limitations of nano-devices. The results show that PHLAME, by exploiting the properties of the Terahertz Band and being aware of the nano-devices’ limitations, is able to support very densely populated nanonetworks with nano-devices transmitting at tens of Gigabit/second.
Nanotechnology is enabling the development of integrated devices just a few hundred nanometers in size. Communication among these nano-devices will boost the applications of nanotechnology in the biomedical, environmental and military fields. Among the communication alternatives at the nanoscale, the state of the art in nanomaterial research points to the Terahertz band (0.1-10 THz) as the frequency range of operation of graphene-based electromagnetic (EM) nano-transceivers. This frequency band supports very large transmission bit-rates and enables simple communication mechanisms suited to the limited capabilities of nano-devices. Due to an expected very large number of nano-devices sharing the same channel, it is necessary to develop new Medium Access Control (MAC) protocols which will be able to capture the peculiarities of nanonetworks in the Terahertz band. In this paper, PHLAME, a physical layer aware MAC protocol for electromagnetic nanonetworks, is introduced. This protocol is built on top of a novel communication scheme based on the exchange of femtosecond-long pulses spread in time, and exploits the benefits of novel low-weight channel coding schemes. In the PHLAME protocol, the transmitting and receiving nano-devices jointly select the communication parameters that minimize the interference in the nanonetwork and maximize the probability of successfully decoding the received information. The performance of the protocol is analyzed in terms of energy consumption, delay and achievable throughput, also taking into account the energy limitations of nano-devices. The results show that, despite its simplicity, the PHLAME protocol is able to support densely populated nanonetworks by exploiting the peculiarities of the Terahertz band.
You can also find me at my office at the Polytechnic University of Catalonia (UPC), C/ Jordi Girona 1-3, Campus Nord, Barcelona. The office is in the C6 building, 2nd floor, room 221.