Source author record

Xun Wang

Xun Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

22 works

20 topics

4 close collaborators


preprint · 2026 · arXiv

Images in Sentences: Scaling Interleaved Instructions for Unified Visual Generation

While recent advancements in multimodal language models have enabled image generation from expressive multi-image instructions, existing methods struggle to maintain performance under complex interleaved instructions. This limitation stems from the structural separation of images and text in current paradigms, which forces models to bridge difficult long-range dependencies to match descriptions with visual targets. To address these challenges, we propose Images iN SEnTences (a.k.a. INSET), a unified generation model that seamlessly embeds images as native vocabulary within textual instructions. By positioning visual features directly at their corresponding semantic slots, INSET leverages the contextual locality of transformers for precise object binding, effectively treating images as dense, expressive language tokens. Furthermore, we introduce a scalable data engine that synthesizes 15M high-quality interleaved samples from standard image and video datasets, utilizing VLMs and LLMs to construct rich, long-horizon sequences. Evaluation results on InterleaveBench demonstrate that INSET significantly outperforms state-of-the-art methods in multi-image consistency and text alignment, with performance gaps widening as input complexity increases. Beyond standard generation, our approach inherently extends to multimodal image editing, integrating visual content as part of the instruction to facilitate highly expressive and creative visual manipulations.

preprint · 2022 · arXiv

Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning

Despite the recent developments in the field of cross-modal retrieval, there has been less research focusing on low-resource languages due to the lack of manually annotated datasets. In this paper, we propose a noise-robust cross-lingual cross-modal retrieval method for low-resource languages. To this end, we use Machine Translation (MT) to construct pseudo-parallel sentence pairs for low-resource languages. However, as MT is not perfect, it tends to introduce noise during translation, rendering textual embeddings corrupted and thereby compromising the retrieval performance. To alleviate this, we introduce a multi-view self-distillation method to learn noise-robust target-language representations, which employs a cross-attention module to generate soft pseudo-targets to provide direct supervision from the similarity-based view and feature-based view. Besides, inspired by the back-translation in unsupervised MT, we minimize the semantic discrepancies between origin sentences and back-translated sentences to further improve the noise robustness of the textual encoder. Extensive experiments are conducted on three video-text and image-text cross-modal retrieval benchmarks across different languages, and the results demonstrate that our method significantly improves the overall performance without using extra human-labeled data. In addition, equipped with a pre-trained visual encoder from a recent vision-and-language pre-training framework, i.e., CLIP, our model achieves a significant performance gain, showing that our method is compatible with popular pre-training models. Code and data are available at https://github.com/HuiGuanLab/nrccr.

preprint · 2022 · arXiv

Exploiting full Resolution Feature Context for Liver Tumor and Vessel Segmentation via Integrate Framework: Application to Liver Tumor and Vessel 3D Reconstruction under embedded microprocessor

Liver cancer is one of the most common malignant diseases in the world. Segmentation and labeling of liver tumors and blood vessels in CT images can assist doctors in liver tumor diagnosis and surgical intervention. Many state-of-the-art medical image segmentation algorithms have appeared in the past decades. With the development of embedded devices, embedded deployment of medical segmentation and automatic reconstruction opens prospects for future automated surgical tasks. Yet most existing segmentation methods focus on the spatial feature context and perceive the semantic relevance of medical images poorly, which significantly affects the segmentation accuracy of liver tumors and blood vessels. Deploying large and complex models on embedded devices requires a reasonable trade-off between model accuracy, inference speed and model capacity. Given these problems, we introduce a Transformer-based multi-scale feature fusion network called TransFusionNet. This network achieves very competitive performance on liver vessel and liver tumor segmentation tasks, and it improves the recognition of morphologic margins of liver tumors by exploiting the global information of CT images. Experiments show that TransFusionNet achieves a mean Dice coefficient of 0.899 on the vessel segmentation task and 0.961 on the liver tumor segmentation task. Compared with the state-of-the-art framework, our model achieves the best segmentation result. In addition, we deployed the model into an embedded micro-structure and constructed an integrated model for liver tumor vascular segmentation and reconstruction. This proprietary structure will be the exclusive component of the future medical field.

preprint · 2022 · arXiv

Modality-Balanced Embedding for Video Retrieval

Video search has become the main routine for users to discover videos relevant to a text query on large short-video sharing platforms. While training a query-video bi-encoder model using online search logs, we identify a modality bias phenomenon: the video encoder relies almost entirely on text matching, neglecting other modalities of the videos such as vision and audio. This modality imbalance results from a) modality gap: the relevance between a query and a video text is much easier to learn as the query is also a piece of text, with the same modality as the video text; b) data bias: most training samples can be solved solely by text matching. Here we share our practices to improve the first retrieval stage, including our solution for the modality imbalance issue. We propose MBVR (short for Modality Balanced Video Retrieval) with two key components: manually generated modality-shuffled (MS) samples and a dynamic margin (DM) based on visual relevance. They encourage the video encoder to pay balanced attention to each modality. Through extensive experiments on a real-world dataset, we show empirically that our method is both effective and efficient in solving the modality bias problem. We have also deployed MBVR on a large video platform and observed a statistically significant boost over a highly optimized baseline in an A/B test and manual GSB evaluations.

preprint · 2022 · arXiv

Progressive Localization Networks for Language-based Moment Localization

This paper targets the task of language-based video moment localization. The language-based setting of this task allows for an open set of target activities, resulting in a large variation of the temporal lengths of video moments. Most existing methods prefer to first sample sufficient candidate moments with various temporal lengths, and then match them with the given query to determine the target moment. However, candidate moments generated with a fixed temporal granularity may be suboptimal to handle the large variation in moment lengths. To this end, we propose a novel multi-stage Progressive Localization Network (PLN) which progressively localizes the target moment in a coarse-to-fine manner. Specifically, each stage of PLN has a localization branch, and focuses on candidate moments that are generated with a specific temporal granularity. The temporal granularities of candidate moments are different across the stages. Moreover, we devise a conditional feature manipulation module and an upsampling connection to bridge the multiple localization branches. In this fashion, the later stages are able to absorb the previously learned information, thus facilitating the more fine-grained localization. Extensive experiments on three public datasets demonstrate the effectiveness of our proposed PLN for language-based moment localization, especially for localizing short moments in long videos.

preprint · 2022 · arXiv

Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval

This paper aims for the task of text-to-video retrieval, where given a query in the form of a natural-language sentence, it is asked to retrieve videos which are semantically relevant to the given query, from a great number of unlabeled videos. The success of this task depends on cross-modal representation learning that projects both videos and sentences into common spaces for semantic similarity computation. In this work, we concentrate on video representation learning, an essential component for text-to-video retrieval. Inspired by the reading strategy of humans, we propose a Reading-strategy Inspired Visual Representation Learning (RIVRL) to represent videos, which consists of two branches: a previewing branch and an intensive-reading branch. The previewing branch is designed to briefly capture the overview information of videos, while the intensive-reading branch is designed to obtain more in-depth information. Moreover, the intensive-reading branch is aware of the video overview captured by the previewing branch. Such holistic information is found to be useful for the intensive-reading branch to extract more fine-grained features. Extensive experiments on three datasets are conducted, where our model RIVRL achieves a new state-of-the-art on TGIF and VATEX. Moreover, on MSR-VTT, our model using two video features shows comparable performance to the state-of-the-art using seven video features and even outperforms models pre-trained on the large-scale HowTo100M dataset.

preprint · 2022 · arXiv

Smart Contract Vulnerability Detection Technique: A Survey

The smart contract, one of the most successful applications of blockchain, is taking the world by storm and plays an essential role in the blockchain ecosystem. However, frequent smart contract security incidents not only result in tremendous economic losses but also destroy the blockchain-based credit system. The security and reliability of smart contracts have thus gained extensive attention from researchers worldwide. In this survey, we first summarize the common types and typical cases of smart contract vulnerabilities at three levels, i.e., the Solidity code layer, the EVM execution layer, and the Block dependency layer. Further, we review the research progress of smart contract vulnerability detection and classify existing methods into five categories, i.e., formal verification, symbolic execution, fuzzing detection, intermediate representation, and deep learning. Empirically, we take 300 real-world smart contracts deployed on Ethereum as test samples and compare the representative methods in terms of accuracy, F1-Score, and average detection time. Finally, we discuss the challenges in the field of smart contract vulnerability detection and look ahead to future research directions that incorporate deep learning.

preprint · 2021 · arXiv

Dual Encoding for Video Retrieval by Text

This paper attacks the challenging problem of video retrieval by text. In such a retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described exclusively in the form of a natural-language sentence, with no visual example provided. Given videos as sequences of frames and queries as sequences of words, an effective sequence-to-sequence cross-modal matching is crucial. To that end, the two modalities need to be first encoded into real-valued vectors and then projected into a common space. In this paper we achieve this by proposing a dual deep encoding network that encodes videos and queries into powerful dense representations of their own. Our novelty is two-fold. First, different from prior art that resorts to a specific single-level encoder, the proposed network performs multi-level encoding that represents the rich content of both modalities in a coarse-to-fine fashion. Second, different from a conventional common space learning algorithm which is either concept based or latent space based, we introduce hybrid space learning which combines the high performance of the latent space and the good interpretability of the concept space. Dual encoding is conceptually simple, practically effective and end-to-end trained with hybrid space learning. Extensive experiments on four challenging video datasets show the viability of the new method.

preprint · 2020 · arXiv

Anisotropy links cell shapes to tissue flow during convergent extension

Within developing embryos, tissues flow and reorganize dramatically on timescales as short as minutes. This includes epithelial tissues, which often narrow and elongate in convergent extension movements due to anisotropies in external forces or in internal cell-generated forces. However, the mechanisms that allow or prevent tissue reorganization, especially in the presence of strongly anisotropic forces, remain unclear. We study this question in the converging and extending Drosophila germband epithelium, which displays planar polarized myosin II and experiences anisotropic forces from neighboring tissues, and we show that in contrast to isotropic tissues, cell shape alone is not sufficient to predict the onset of rapid cell rearrangement. From theoretical considerations and vertex model simulations, we predict that in anisotropic tissues two experimentally accessible metrics of cell patterns, the cell shape index and a cell alignment index, are required to determine whether an anisotropic tissue is in a solid-like or fluid-like state. We show that changes in cell shape and alignment over time in the Drosophila germband predict the onset of rapid cell rearrangement in both wild-type and snail twist mutant embryos, where our theoretical prediction is further improved when we also account for cell packing disorder. These findings suggest that convergent extension is associated with a transition to more fluid-like tissue behavior, which may help accommodate tissue shape changes during rapid developmental events.
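As an illustration of one of the two metrics named in this abstract, the sketch below computes a cell shape index for a polygonal cell outline. It assumes the definition conventional in vertex-model work, perimeter divided by the square root of area; the abstract itself does not spell out the formula, and the cell alignment index is not reproduced here because it is not defined in this record. The helper names `shape_index` and `regular_polygon` are hypothetical.

```python
import math


def shape_index(vertices):
    """Perimeter / sqrt(area) of a simple polygon given as (x, y) vertices.

    Assumed definition (conventional in vertex-model literature),
    not taken from the abstract above.
    """
    n = len(vertices)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace formula term
    area = abs(twice_area) / 2.0
    return perimeter / math.sqrt(area)


def regular_polygon(sides, radius=1.0):
    """Vertices of a regular polygon centred on the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / sides),
         radius * math.sin(2 * math.pi * k / sides))
        for k in range(sides)
    ]


# A regular hexagon versus the same hexagon stretched along x at constant area.
hexagon_index = shape_index(regular_polygon(6))
stretched = [(2.0 * x, 0.5 * y) for x, y in regular_polygon(6)]
stretched_index = shape_index(stretched)
```

Under this definition the index is scale-free, so elongating a cell at fixed area raises it; the stretched hexagon therefore scores higher than the regular one.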

preprint · 2020 · arXiv

Channel Interaction Networks for Fine-Grained Image Categorization

Fine-grained image categorization is challenging due to the subtle inter-class differences. We posit that exploiting the rich relationships between channels can help capture such differences since different channels correspond to different semantics. In this paper, we propose a channel interaction network (CIN), which models the channel-wise interplay both within an image and across images. For a single image, a self-channel interaction (SCI) module is proposed to explore channel-wise correlation within the image. This allows the model to learn the complementary features from the correlated channels, yielding stronger fine-grained features. Furthermore, given an image pair, we introduce a contrastive channel interaction (CCI) module to model the cross-sample channel interaction with a metric learning framework, allowing the CIN to distinguish the subtle visual differences between images. Our model can be trained efficiently in an end-to-end fashion without the need for multi-stage training and testing. Finally, comprehensive experiments are conducted on three publicly available benchmarks, where the proposed method consistently outperforms the state-of-the-art approaches, such as DFL-CNN (Wang, Morariu, and Davis 2018) and NTS (Yang et al. 2018).

preprint · 2020 · arXiv

Cross-Batch Memory for Embedding Learning

Mining informative negative instances is of central importance to deep metric learning (DML); however, this task is intrinsically limited by mini-batch training, where only a mini-batch of instances is accessible at each iteration. In this paper, we identify a "slow drift" phenomenon by observing that the embedding features drift exceptionally slowly even as the model parameters are updated throughout the training process. This suggests that the features of instances computed at preceding iterations can closely approximate their features extracted by the current model. We propose a cross-batch memory (XBM) mechanism that memorizes the embeddings of past iterations, allowing the model to collect sufficient hard negative pairs across multiple mini-batches, even over the whole dataset. Our XBM can be directly integrated into a general pair-based DML framework, where the XBM-augmented DML can boost performance considerably. In particular, without bells and whistles, a simple contrastive loss with our XBM can have large R@1 improvements of 12%-22.5% on three large-scale image retrieval datasets, surpassing the most sophisticated state-of-the-art methods by a large margin. Our XBM is conceptually simple, easy to implement (using several lines of code), and memory efficient (with a negligible 0.2 GB of extra GPU memory). Code is available at: https://github.com/MalongTech/research-xbm.
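The memory mechanism described in this abstract can be pictured as a fixed-size first-in, first-out store of past embeddings that is searched for negatives from earlier mini-batches. The following is a minimal sketch under stated assumptions, not the authors' implementation (which lives in the repository linked above): the class name `CrossBatchMemory`, the dot-product similarity, the threshold, and the toy labels are all hypothetical.

```python
from collections import deque


class CrossBatchMemory:
    """FIFO store of (embedding, label) pairs from recent mini-batches."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.queue = deque(maxlen=capacity)

    def enqueue(self, embeddings, labels):
        """Add one mini-batch of embeddings and their class labels."""
        for emb, lab in zip(embeddings, labels):
            self.queue.append((tuple(emb), lab))

    def negatives_for(self, embedding, label, threshold):
        """Stored embeddings of other classes with similarity above threshold.

        Dot-product similarity is an assumption made for this sketch.
        """
        hard = []
        for mem_emb, mem_lab in self.queue:
            if mem_lab == label:
                continue
            sim = sum(a * b for a, b in zip(embedding, mem_emb))
            if sim > threshold:
                hard.append((mem_emb, sim))
        return hard


memory = CrossBatchMemory(capacity=4)
memory.enqueue([(1.0, 0.0), (0.0, 1.0)], ["cat", "dog"])   # earlier batch
memory.enqueue([(0.8, 0.6), (0.6, 0.8)], ["dog", "cat"])   # later batch

# Negatives for a "cat" query are drawn from both stored batches.
hard_negatives = memory.negatives_for((1.0, 0.0), "cat", threshold=0.5)

# A fifth entry evicts the oldest one, keeping the memory at capacity.
memory.enqueue([(0.9, 0.1)], ["bird"])
```

The queue only makes sense because of the slow-drift observation in the abstract: stale embeddings are treated as close enough to what the current model would produce.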

preprint · 2020 · arXiv

Feature Re-Learning with Data Augmentation for Video Relevance Prediction

Predicting the relevance between two given videos with respect to their visual content is a key component for content-based video recommendation and retrieval. Thanks to the increasing availability of pre-trained image and video convolutional neural network models, deep visual features are widely used for video content representation. However, as how two videos are relevant is task-dependent, such off-the-shelf features are not always optimal for all tasks. Moreover, due to varied concerns including copyright, privacy and security, one might have access to only pre-computed video features rather than original videos. We propose in this paper feature re-learning for improving video relevance prediction, with no need of revisiting the original video content. In particular, re-learning is realized by projecting a given deep feature into a new space by an affine transformation. We optimize the re-learning process by a novel negative-enhanced triplet ranking loss. In order to generate more training data, we propose a new data augmentation strategy which works directly on frame-level and video-level features. Extensive experiments in the context of the Hulu Content-based Video Relevance Prediction Challenge 2018 justify the effectiveness of the proposed method and its state-of-the-art performance for content-based video relevance prediction.
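The two ingredients named in this abstract, an affine projection of a pre-computed feature and a triplet ranking loss, can be sketched in a few lines. This is an illustration only: it uses a plain triplet ranking loss rather than the paper's negative-enhanced variant, whose exact form is not given in this record, and the cosine similarity, the margin value, and the toy weights are assumptions.

```python
import math


def affine(weights, bias, feature):
    """Project a feature vector into a new space: W x + b."""
    return [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weights, bias)
    ]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_ranking_loss(anchor, positive, negative, margin=0.2):
    """Plain triplet ranking loss (not the negative-enhanced variant)."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


# Hypothetical projection that keeps the first two of three input dimensions.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
b = [0.0, 0.0]

anchor = affine(W, b, [1.0, 0.0, 5.0])
positive = affine(W, b, [2.0, 0.0, -3.0])
negative = affine(W, b, [0.0, 1.0, 5.0])

loss = triplet_ranking_loss(anchor, positive, negative)       # ranking satisfied
violated = triplet_ranking_loss(anchor, negative, positive)   # roles swapped
```

In training, `W` and `b` would be learned by minimising this loss over many triplets; only the pre-computed features are needed, which matches the abstract's point about not revisiting the original videos.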

preprint · 2020 · arXiv

Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning

A family of loss functions built on pair-based computation has been proposed in the literature, providing a myriad of solutions for deep metric learning. In this paper, we provide a general weighting framework for understanding recent pair-based loss functions. Our contributions are three-fold: (1) we establish a General Pair Weighting (GPW) framework, which casts the sampling problem of deep metric learning into a unified view of pair weighting through gradient analysis, providing a powerful tool for understanding recent pair-based loss functions; (2) we show that with GPW, various existing pair-based methods can be compared and discussed comprehensively, with clear differences and key limitations identified; (3) we propose a new loss called multi-similarity loss (MS loss) under the GPW, which is implemented in two iterative steps (i.e., mining and weighting). This allows it to fully consider three similarities for pair weighting, providing a more principled approach for collecting and weighting informative pairs. Finally, the proposed MS loss obtains new state-of-the-art performance on four image retrieval benchmarks, where it outperforms the most recent approaches, such as ABE and HTL, by a large margin: 60.6% to 65.7% on CUB200, and 80.9% to 88.0% on In-Shop Clothes Retrieval dataset at Recall@1. Code is available at https://github.com/MalongTech/research-ms-loss.

preprint · 2020 · arXiv

Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval

The rapid growth of user-generated videos on the Internet has intensified the need for text-based video retrieval systems. Traditional methods mainly favor the concept-based paradigm for retrieval with simple queries and are usually ineffective for complex queries that carry far more complex semantics. Recently, the embedding-based paradigm has emerged as a popular approach. It aims to map the queries and videos into a shared embedding space where semantically similar texts and videos are much closer to each other. Despite its simplicity, it forgoes the exploitation of the syntactic structure of text queries, making it suboptimal for modeling complex queries. To facilitate video retrieval with complex queries, we propose a Tree-augmented Cross-modal Encoding method that jointly learns the linguistic structure of queries and the temporal representation of videos. Specifically, given a complex user query, we first recursively compose a latent semantic tree to structurally describe the text query. We then design a tree-augmented query encoder to derive a structure-aware query representation and a temporal attentive video encoder to model the temporal characteristics of videos. Finally, both the query and videos are mapped into a joint embedding space for matching and ranking. With this approach, we gain a better understanding and modeling of complex queries, thereby achieving better video retrieval performance. Extensive experiments on large-scale video retrieval benchmark datasets demonstrate the effectiveness of our approach.

preprint · 2018 · arXiv

Electronic nature of coverage-dependent nanosurface effect by cooperative orbital redistribution

Nanomaterial surface states can effectively modify or even dominate their physical and chemical properties due to large surface-to-volume ratios. Such surface effects are highly dependent on particle size and ligand coverage, yet the underlying electronic-level mechanism still remains unknown. Using TiO2 nanosheet as a model system, we reveal the electronic nature of coverage-dependent nanosurface effects through varying ligand coverage and probing the modified surface bonding and electronic band structures with near-edge X-ray absorption fine structure. We discover experimentally that surface ligands can competitively polarize the 3d orbitals of surface Ti atoms into chemisorption states, which is cooperative with increased ligand coverages. Such coverage-dependent cooperative orbital redistribution accounts for various nanosurface effects on regulating the electronic structure, surface reactivity, optical property, and chemisorption of nanomaterials.

preprint · 2016 · arXiv

PyRIDE: An Interactive Development Environment for PR2 Robot

Python based Robot Interactive Development Environment (PyRIDE) is software that supports rapid interactive programming of robot skills and behaviours on the PR2/ROS (Robot Operating System) platform. One of the key features of PyRIDE is its interactive, remotely accessible Python console that allows its users to program robots online and in real time in the same way as using the standard Python interactive interpreter. It allows programs to be modified while they are running. PyRIDE is also a software integration framework that abstracts and aggregates disparate low-level ROS software modules, e.g. arm joint motor controllers, and exposes their functionalities through a unified Python programming interface. PR2 programmers are able to experiment and develop robot behaviours without dealing with specific details of accessing the underlying software and hardware. PyRIDE provides a client-server mechanism that allows remote user access to the robot functionalities, e.g. remote robot monitoring and control, and access to real-time robot camera image data. This enables multi-modal human-robot interactions using different devices and user interfaces. All these features are seamlessly integrated into one lightweight and portable middleware package. In this paper, we use four real-life scenarios to demonstrate PyRIDE's key features and illustrate the usefulness of the software.

preprint · 2016 · arXiv

The extended Kerr-Schild approach to general relativity

We study in some detail the "extended Kerr-Schild" formulation of general relativity, which decomposes the gauge-independent degrees of freedom of a generic metric into two arbitrary functions and the choice of a flat background tetrad. We recast Einstein's equations and spacetime curvatures in the extended Kerr-Schild form and discuss their properties, illustrated with simple examples.

preprint · 2015 · arXiv

Constraints on force-free magnetospheres for Kerr(-AdS) black holes with non-null currents

Force-free magnetospheres are of particular interest due to their role in energy extraction from Kerr black holes via the Blandford-Znajek process. Recently, a class of exact analytic solutions has been found with null currents [1,2]. In this paper, we elaborate some constraints on various force-free magnetosphere solutions with non-null currents, utilizing the Newman-Penrose electromagnetic scalars to categorize a range of different cases. We perform a thorough search for stationary and axisymmetric (SAS) solutions, and find that putative SAS solutions within the categories considered generically exhibit singularities on the horizon. We also present some non-SAS solutions found via spacetime-dependent electric-magnetic duality rotations. Additional special solutions in flat, pure AdS and near-horizon-extreme-Kerr (NHEK) spacetimes are also presented.

preprint · 2014 · arXiv

Kerr-AdS Black Holes and Force-Free Magnetospheres

We obtain analogs of the Blandford-Znajek split monopole solution for force-free magnetospheres around a slowly rotating Kerr-AdS black hole. For small black holes, we find an analytic solution to first order in the ratio of horizon radius to AdS scale, $r_H/l$, which exhibits a radial Poynting flux and for $r_H/l \rightarrow 0$ smoothly approaches the Blandford-Znajek configuration in an asymptotically flat Kerr background. However, for large Kerr-AdS black holes with $r_H/l > 1$, namely those for which the bulk black hole holographically describes the thermodynamics of a strongly-interacting boundary field theory, the existence of a globally well-defined timelike Killing vector external to the horizon suggests the absence of energy extraction through the Blandford-Znajek process. In this regime, we find that at least for slow rotation the force-free solution still exists but exhibits a range of angular velocities for the field lines, corresponding to the freedom in the dual field theory to rotate a magnetic field through a neutral plasma. As a byproduct of this work, we also obtain an analytic solution for a rotating monopole magnetosphere in pure AdS, analogous to the Michel solution in flat space.

preprint · 2014 · arXiv

Modeling Word Relatedness in Latent Dirichlet Allocation

The standard LDA model suffers from the problem that the topic assignment of each word is independent, so word correlation is neglected. To address this problem, we propose a model called Word Related Latent Dirichlet Allocation (WR-LDA) that incorporates word correlation into LDA topic models. This leads to new capabilities that the standard LDA model does not have, such as estimating infrequently occurring words or multi-language topic modeling. Experimental results demonstrate the effectiveness of our model compared with standard LDA.

preprint · 2014 · arXiv

What a Nasty day: Exploring Mood-Weather Relationship from Twitter

While it has long been believed in psychology that weather somehow influences human mood, debate has gone on for decades about how they are correlated. In this paper, we study this long-standing topic by harnessing a data source that is new to traditional psychological research: Twitter. We analyze two years of Twitter data collected through the Twitter API, amounting to 10% of all postings, and try to reveal the correlations between the multi-dimensional structure of human mood and meteorological effects. Some of our findings confirm existing hypotheses, while others contradict them. We are hopeful that our approach, along with the new data source, can shed light on the long-running debate on weather-mood correlation.

preprint · 2010 · arXiv

Behavioral Simulations in MapReduce

In many scientific domains, researchers are turning to large-scale behavioral simulations to better understand important real-world phenomena. While there has been a great deal of work on simulation tools from the high-performance computing community, behavioral simulations remain challenging to program and automatically scale in parallel environments. In this paper we present BRACE (Big Red Agent-based Computation Engine), which extends the MapReduce framework to process these simulations efficiently across a cluster. We can leverage spatial locality to treat behavioral simulations as iterated spatial joins and greatly reduce the communication between nodes. In our experiments we achieve nearly linear scale-up on several realistic simulations. Though processing behavioral simulations in parallel as iterated spatial joins can be very efficient, it can be much simpler for the domain scientists to program the behavior of a single agent. Furthermore, many simulations include a considerable amount of complex computation and message passing between agents, which makes it important to optimize the performance of a single node and the communication across nodes. To address both of these challenges, BRACE includes a high-level language called BRASIL (the Big Red Agent SImulation Language). BRASIL has object oriented features for programming simulations, but can be compiled to a data-flow representation for automatic parallelization and optimization. We show that by using various optimization techniques, we can achieve both scalability and single-node performance similar to that of a hand-coded simulation.
