IEEE Transactions on Pattern Analysis and Machine Intelligence

IEEE Transactions on Pattern Analysis and Machine Intelligence

IEEE Transactions on Pattern Analysis and Machine Intelligence (sometimes abbreviated as IEEE PAMI or simply PAMI) is a monthly peer-reviewed scientific journal published by the IEEE Computer Society. == Background == The journal covers research in computer vision and image understanding, pattern analysis and recognition, machine intelligence, machine learning, search techniques, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, and face and gesture recognition. The editor-in-chief is Kyoung Mu Lee (Seoul National University). According to the Journal Citation Reports, the journal has a 2023 impact factor of 20.8.

MeituPic

Meitu Xiu Xiu ("Meitu") (Chinese: 美图秀秀) is an image editing software that is mostly used in Mainland China but is also popular in Hong Kong and Taiwan. It is only available on Google Play and App Store in certain countries. It provides tools for editing photos: filters, retouching, collage, scenes, frames, and photo decorations, as well as generative AI features such as text-to-images, AI removal and AI repainting etc. Meitu is one of the apps developed by Meitu, Inc.; it also produced BeautyCam, Wink and X-Design. == History == Meitu's PC version was created in 2008 by Wu Xinhong, the CEO of Meitu. In 2013, its mobile version became one of the first must-have mobile apps in China. Meitu, Inc. is a photo and video-centered app developer, which was founded in 2008 in Xiamen. Currently, the major revenue source of Meitu is premium subscription. Meitu, Inc. was initially funded by Cai Wensheng, a well-known angel investor. The company has an approximately 250 million monthly active users globally. == Function == === Edit === MeituPic provides a number of photo-editing tools. The major functions are auto enhance, edit, enhance, filters, frames, magic brush, mosaic, text, and blur. Auto enhance focuses on the nature of photos taken, while Edit includes functions of cropping, rotation, sharpening, and adjustment of ratio. For Enhance, users can apply slight adjustment on the photo by controlling the levels of brightness, contrast, colour temperature, saturation, highlight, shadow and smart light. Major types of filters are LOMO, beauty, style as well as art. Different frames can be chosen from poster, simple, and fantasy. Magic brush provides a great variety of brushes with different colours and patterns for users to decorate the photos. Mosaic brush enables users to cover certain parts of the photo. Texts can be added to the photo. Choices of different bubbles, font as well as style of words are available. Blurring effect is also available to make the photo less distinct and clear. === Beauty Retouch === There are seven major functions for retouching a photo: automatic retouch, smooth and whiten skin, remove blemish, make slimmer, remove dark circles and bags under the eyes, make taller, and enhance the eyes. Automatic retouch enhances portraits by lightening the skin tone, brightening the eyes, and simulating a face-lift by tapping on just one button. This helps to remove wrinkles and optimizes the skin tone. Acne, blemishes, and other skin imperfections can also be removed. The face-lift and weight-loss functions in the slimming option can be used to reshape the body. The option to make the subject taller can be used to change the perceived height of the subject and give the impression of slimmer, longer legs. The option to enhance the eyes can enlarge and brighten the eyes. === Collage === Collage has four types: template, freestyle, poster, PicStrip, which all maximize to insert nine photos. Template integrates photos in a vertical rectangle tightly. MeituPic has 15 frames or free download function for users. MeituPic also provides different templates according to number of photos inserted. Freestyle separates photos on a background freely. There are two parts of background: custom and more. For custom, users choose from album. For more, there are plain and picture with 18 choices. Poster makes a poster with photos. Users choose a poster among 8 choices or tap ‘more’ to download a new one. PicStrip combines photos vertically making an elongated file. Users choose a frame from 15 choices. Pinching thumb and forefinger together or apart zooms photos in/out. Putting two fingers and turning hand rotates photos. Pressing moves photos to ideal location. After designing, users tap ‘save/share’ on the upper right corner and the photo made is saved into album automatically. == Awards ==

Lion algorithm

Lion algorithm (LA) is one among the bio-inspired (or) nature-inspired optimization algorithms (or) that are mainly based on meta-heuristic principles. It was first introduced by B. R. Rajakumar in 2012 in the name, Lion’s Algorithm. It was further extended in 2014 to solve the system identification problem. This version was referred as LA, which has been applied by many researchers for their optimization problems. == Inspiration from lion’s social behaviour == Lions form a social system called a "pride", which consists of 1–3 pair of lions. A pride of lions shares a common area known as territory in which a dominant lion is called as territorial lion. The territorial lion safeguards its territory from outside attackers, especially nomadic lions. This process is called territorial defense. It protects the cubs till they become sexually matured. The maturity period is about 2–4 years. The pride undergoes survival fights to protect its territory and the cubs from nomadic lions. Upon getting defeated by the nomadic lions, the dominating nomadic lion takes the role of territorial lion by killing or driving out the cubs of the pride. The lioness of the pride give birth to cubs though the new territorial lion. When the cubs of the pride mature and considered to be stronger than the territorial lion, they take over the pride. This process is called territorial take-over. If territorial take-over happens, either the old territorial lion, which is considered to be laggard, is driven out or it leaves the pride. The stronger lions and lioness form the new pride and give birth to their own cubs == Terminology == In the LA, the terms that are associated with lion’s social system are mapped to the terminology of optimization problems. Few of such notable terms are related here. Lion: A potential solution to be generated or determined as optimal (or) near-optimal solution of the problem. The lion can be a territorial lion and lioness, cubs and nomadic lions that represent the solution based on the processing steps of the LA. Territorial lion: The strongest solution of the pride that tends to meet the objective function. Nomadic lion: A random solution, sometimes termed as nomad, to facilitate the exploration principle Laggard lion: Poor solutions that are failed in the survival fight. Pride: A pool of potential solutions i.e. a lion, lioness and their cubs, that are potential solutions of the search problem. Fertility evaluation: A process of evaluating whether the territorial lion and lioness are able to provide potential solutions in the future generations i.e. It ensures that the lion or lioness converge at every generation. Survival fight: It is a greedy selection process, which is often carried out between the pride and nomadic lion. == Algorithm == The steps involved in LA are given below: Pride Generation: Generate X m a l e {\displaystyle X^{male}} , X f e m a l e {\displaystyle X^{female}} and X 1 n o m a d {\displaystyle X_{1}^{nomad}} Determine f ( X m a l e ) {\displaystyle f(X^{male})} , f ( X f e m a l e ) {\displaystyle f(X^{female})} , f ( X 1 n o m a d ) {\displaystyle f(X_{1}^{nomad})} Initialize f r e f {\displaystyle f^{ref}} as f ( X m a l e ) {\displaystyle f(X^{male})} and N g {\displaystyle N_{g}} as 0 Memorize X m a l e {\displaystyle X^{male}} and X f e m a l e {\displaystyle X^{female}} Apply Fertility evaluation Process Generation of cubpool by mating Gender clustering: Define X c u b m a l e {\displaystyle X_{cub}^{male}} and X c u b f e m a l e {\displaystyle X_{cub}^{female}} Initialize a g e c u b {\displaystyle age_{cub}} as zero Apply Cub growth function Territorial defense: If X m a l e {\displaystyle X^{male}} (or pride) fails in the survival fight i.e. X 1 n o m a d {\displaystyle X_{1}^{nomad}} defeats the pride, go to step 4, else continue Increase a g e c u b {\displaystyle age_{cub}} by 1 and check whether cub attains maturity i.e., if a g e c u b > a g e m a x {\displaystyle age_{cub}>age_{max}} , go to Step 9, else continue Territorial takeover: If X c u b m a l e {\displaystyle X_{cub}^{male}} and X c u b f e m a l e {\displaystyle X_{cub}^{female}} are found to be closer to optimal solution, update X m a l e {\displaystyle X^{male}} and X f e m a l e {\displaystyle X^{female}} Increment N g {\displaystyle N_{g}} by 1 Repeat from Step 5, if termination criterion is not violated, else return X m a l e {\displaystyle X^{male}} as the near-optimal solution == Variants == The LA has been further taken forward to adopt in different problem areas. According to the characteristics of the problem area, significant amendment has been done in the processes and the models used in the LA. Accordingly, diverse variants have been developed by the researchers. They can be broadly grouped as hybrid LAs and non-hybrid LAs. Hybrid LAs are the LAs that are amended by the principle of other meta-heuristics, whereas the Non-hybrid LAs take any scientific amendment inside its operation that are felt to be essential to attend the respective problem area. == Applications == LA is applied in diverse engineering applications that range from network security, text mining, image processing, electrical systems, data mining and many more. Few of the notable applications are discussed here. Networking applications: In WSN, LA is used to solve the cluster head selection problem by determining optimal cluster head. Route discovery problem in both the VANET and MANET are also addressed by the LA in the literature. It is also used to detect attacks in advanced networking scenarios such as Software-Defined Networks (SDN) Power Systems: LA has attended generation rescheduling problem in a deregulated environment, optimal localization and sizing of FACTS devices for power quality enhancement and load-frequency controlling problem Cloud computing: LA is used in optimal container-resource allocation problem in cloud environment and cloud security

Algorithmic paradigm

An algorithmic paradigm or algorithm design paradigm is a generic model or framework which underlies the design of a class of algorithms. An algorithmic paradigm is an abstraction higher than the notion of an algorithm, just as an algorithm is an abstraction higher than a computer program. == List of well-known paradigms == === General === Backtracking Branch and bound Brute-force search Divide and conquer Dynamic programming Greedy algorithm Recursion Prune and search === Parameterized complexity === Kernelization Iterative compression === Computational geometry === Sweep line algorithms Rotating calipers Randomized incremental construction

WCF Data Services

WCF Data Services (formerly ADO.NET Data Services, codename "Astoria") is a platform for what Microsoft calls Data Services. It is actually a combination of the runtime and a web service through which the services are exposed. It also includes the Data Services Toolkit which lets Astoria Data Services be created from within ASP.NET itself. The Astoria project was announced at MIX 2007, and the first developer preview was made available on April 30, 2007. The first CTP was made available as a part of the ASP.NET 3.5 Extensions Preview. The final version was released as part of Service Pack 1 of the .NET Framework 3.5 on August 11, 2008. The name change from ADO.NET Data Services to WCF data Services was announced at the 2009 PDC. == Overview == WCF Data Services exposes data, represented as Entity Data Model (EDM) objects, via web services accessed over HTTP. The data can be addressed using a REST-like URI. The data service, when accessed via the HTTP GET method with such a URI, will return the data. The web service can be configured to return the data in either plain XML, JSON or RDF+XML. In the initial release, formats like RSS and ATOM are not supported, though they may be in the future. In addition, using other HTTP methods like PUT, POST or DELETE, the data can be updated as well. POST can be used to create new entities, PUT for updating an entity, and DELETE for deleting an entity. == Description == Windows Communication Foundation (WCF) comes to the rescue when we find ourselves not able to achieve what we want to achieve using web services, i.e., other protocols support and even duplex communication. With WCF, we can define our service once and then configure it in such a way that it can be used via HTTP, TCP, IPC, and even Message Queues. We can consume Web Services using server side scripts (ASP.NET), JavaScript Object Notations (JSON), and even REST (Representational State Transfer). Understanding the basics When we say that a WCF service can be used to communicate using different protocols and from different kinds of applications, we will need to understand how we can achieve this. If we want to use a WCF service from an application, then we have three major questions: 1.Where is the WCF service located from a client's perspective? 2.How can a client access the service, i.e., protocols and message formats? 3.What is the functionality that a service is providing to the clients? Once we have the answer to these three questions, then creating and consuming the WCF service will be a lot easier for us. The WCF service has the concept of endpoints. A WCF service provides endpoints which client applications can use to communicate with the WCF service. The answer to these above questions is what is known as the ABC of WCF services and in fact are the main components of a WCF service. So let's tackle each question one by one. Address: Like a webservice, a WCF service also provides a URI which can be used by clients to get to the WCF service. This URI is called as the Address of the WCF service. This will solve the first problem of "where to locate the WCF service?" for us. Binding: Once we are able to locate the WCF service, one should think about how to communicate with the service (protocol wise). The binding is what defines how the WCF service handles the communication. It could also define other communication parameters like message encoding, etc. This will solve the second problem of "how to communicate with the WCF service?" for us. Contract: Now the only question one is left with is about the functionalities that a WCF service provides. The contract is what defines the public data and interfaces that WCF service provides to the clients. The URIs representing the data will contain the physical location of the service, as well as the service name. It will also need to specify an EDM Entity-Set or a specific entity instance, as in respectively http://dataserver/service.svc/MusicCollection or http://dataserver/service.svc/MusicCollection[SomeArtist] The former will list all entities in the Collection set whereas the latter will list only for the entity which is indexed by SomeArtist. The URIs can also specify a traversal of a relationship in the Entity Data Model. For example, http://dataserver/service.svc/MusicCollection[SomeSong]/Genre traverses the relationship Genre (in SQL parlance, joins with the Genre table) and retrieves all instances of Genre that are associated with the entity SomeSong. Simple predicates can also be specified in the URI, like http://dataserver/service.svc/MusicCollection[SomeArtist]/ReleaseDate[Year eq 2006] will fetch the items that are indexed by SomeArtist and had their release in 2006. Filtering and partition information can also be encoded in the URL as http://dataserver/service.svc/MusicCollection?$orderby=ReleaseDate&$skip=100&$top=50 Although the presence of skip and top keywords indicates paging support, in Data Services version 1 there is no method of determining the number of records available and thus impossible to determine how many pages there may be. The OData 2.0 spec adds support for the $count path segment (to return just a count of entities) and $inlineCount (to retrieve a page worth of entities and a total count without a separate round-trip....).

Multimodal representation learning

Multimodal representation learning is a subfield of representation learning focused on integrating and interpreting information from different modalities, such as text, images, audio, or video, by projecting them into a shared latent space. This allows for semantically similar content across modalities to be mapped to nearby points within that space, facilitating a unified understanding of diverse data types. By automatically learning meaningful features from each modality and capturing their inter-modal relationships, multimodal representation learning enables a unified representation that enhances performance in cross-media analysis tasks such as video classification, event detection, and sentiment analysis. It also supports cross-modal retrieval and translation, including image captioning, video description, and text-to-image synthesis. == Motivation == The primary motivations for multimodal representation learning arise from the inherent nature of real-world data and the limitations of unimodal approaches. Since multimodal data offers complementary and supplementary information about an object or event from different perspectives, it is more informative than relying on a single modality. A key motivation is to narrow the heterogeneity gap that exists between different modalities by projecting their features into a shared semantic subspace. This allows semantically similar content across modalities to be represented by similar vectors, facilitating the understanding of relationships and correlations between them. Multimodal representation learning aims to leverage the unique information provided by each modality to achieve a more comprehensive and accurate understanding of concepts. These unified representations are crucial for improving performance in various cross-media analysis tasks such as video classification, event detection, and sentiment analysis. They also enable cross-modal retrieval, allowing users to search and retrieve content across different modalities. Additionally, it facilitates cross-modal translation, where information can be converted from one modality to another, as seen in applications like image captioning and text-to-image synthesis. The abundance of ubiquitous multimodal data in real-world applications, including understudied areas like healthcare, finance, and human-computer interaction (HCI), further motivates the development of effective multimodal representation learning techniques. == Approaches and methods == === Canonical-correlation analysis based methods === Canonical-correlation analysis (CCA) was first introduced in 1936 by Harold Hotelling and is a fundamental approach for multimodal learning. CCA aims to find linear relationships between two sets of variables. Given two data matrices X ∈ R n × p {\displaystyle X\in \mathbb {R} ^{n\times p}} and Y ∈ R n × q {\displaystyle Y\in \mathbb {R} ^{n\times q}} representing different modalities, CCA finds projection vectors w x ∈ R p {\displaystyle w_{x}\in \mathbb {R} ^{p}} and w y ∈ R q {\displaystyle w_{y}\in \mathbb {R} ^{q}} that maximizes the correlation between the projected variables: ρ = max w x , w y w x ⊤ Σ x y w y w x ⊤ Σ x x w x w y ⊤ Σ y y w y {\displaystyle \rho =\max _{w_{x},w_{y}}{\frac {w_{x}^{\top }\Sigma _{xy}w_{y}}{{\sqrt {w_{x}^{\top }\Sigma _{xx}w_{x}}}{\sqrt {w_{y}^{\top }\Sigma _{yy}w_{y}}}}}} such that Σ x x {\displaystyle \Sigma _{xx}} and Σ y y {\displaystyle \Sigma _{yy}} are the within-modality covariance matrices, and Σ x y {\displaystyle \Sigma _{xy}} is the between-modality covariance matrix. However, standard CCA is limited by its linearity, which led to the development of nonlinear extensions, such as kernel CCA and deep CCA. ==== Kernel CCA ==== Kernel canonical correlation analysis (KCCA) extends traditional CCA to capture nonlinear relationships between modalities by implicitly mapping the data into high dimensional feature spaces using kernel functions. Given kernel functions K x {\displaystyle K_{x}} and K y {\displaystyle K_{y}} with corresponding Gram matrices K x ∈ R n × n {\displaystyle K_{x}\in \mathbb {R} ^{n\times n}} and K y ∈ R n × n {\displaystyle K_{y}\in \mathbb {R} ^{n\times n}} , KCCA seeks coefficients α {\displaystyle \alpha } and β {\displaystyle \beta } that maximize: ρ = max α , β α ⊤ K x K y β α ⊤ K x 2 α β ⊤ K y 2 β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{\top }K_{x}Ky\beta }{{\sqrt {\alpha ^{\top }K_{x}^{2}\alpha }}{\sqrt {\beta ^{\top }K_{y}^{2}\beta }}}}} To prevent overfitting, regularization terms are typically added, resulting in: ρ = max α , β α T K x K y β α T ( K x 2 + λ x K x ) α β T ( K y 2 + λ y K y ) β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{T}K_{x}K_{y}\beta }{{\sqrt {\alpha ^{T}\left(K_{x}^{2}+\lambda _{x}K_{x}\right)\alpha }}{\sqrt {\;\beta ^{T}\left(K_{y}^{2}+\lambda _{y}K_{y}\right)\beta }}}}} where λ x {\displaystyle \lambda _{x}} and λ y {\displaystyle \lambda _{y}} are regularization parameters. KCCA has proven effective for tasks such as cross-modal retrieval and semantic analysis, though it faces computational challenges with large datasets due to its O ( n 2 ) {\displaystyle O(n^{2})} memory requirement for sorting kernel matrices. KCCA was proposed independently by several researchers. ==== Deep CCA ==== Deep canonical correlation analysis (DCCA), introduced in 2013, employs neural networks to learn nonlinear transformations for maximizing the correlation between modalities. DCCA uses separate neural networks f x {\displaystyle f_{x}} and f y {\displaystyle f_{y}} for each modality to transform the original data before applying CCA: max W x , W y , θ x , θ y corr ⁡ ( f x ( X ; θ x ) , f y ( Y ; θ y ) ) {\displaystyle \max _{W_{x},W_{y},\theta _{x},\theta _{y}}\operatorname {corr} \left(f_{x}(X;\theta _{x}),f_{y}(Y;\theta _{y})\right)} where θ x {\displaystyle \theta _{x}} and θ y {\displaystyle \theta _{y}} represent the parameters of the neural networks, and W x {\displaystyle W_{x}} and W y {\displaystyle W_{y}} are the CCA projection matrices. The correlation objective is computed as: corr ⁡ ( H x , H y ) = tr ⁡ ( T − 1 / 2 H x T H y S − 1 / 2 ) {\displaystyle \operatorname {corr} (H_{x},H_{y})=\operatorname {tr} \left(T^{-1/2}H_{x}^{T}H_{y}S^{-1/2}\right)} where H x = f x ( X ) {\displaystyle H_{x}=f_{x}(X)} and H y = f y ( Y ) {\displaystyle H_{y}=f_{y}(Y)} are the network outputs, T = H x T H x + r x I {\displaystyle T=H_{x}^{T}H_{x}+r_{x}I} , S = H y T H y + r y I {\displaystyle S=H_{y}^{T}H_{y}+r_{y}I} and r x , r y {\displaystyle r_{x},r_{y}} are the regularization parameters. DCCA overcomes the limitations of linear CCA and kernel CCA by learning complex nonlinear relationships while maintaining computational efficiency for large datasets through mini-batch optimization. === Graph-based methods === Graph-based approaches for multimodal representation learning leverage graph structure to model relationships between entities across different modalities. These methods typically represent each modality as a graph and then learn embedding that preserve cross-modal similarities, enabling more effective joint representation of heterogeneous data. One such method is cross-modal graph neural networks (CMGNNs) that extend traditional graph neural networks (GNNs) to handle data from multiple modalities by constructing graphs that capture both intra-modal and inter-modal relationships. These networks model interactions across modalities by representing them as nodes and their relationships as edges. Other graph-based methods include Probabilistic Graphical Models (PGMs) such as deep belief networks (DBN) and deep Boltzmann machines (DBM). These models can learn a joint representation across modalities, for instance, a multimodal DBN achieves this by adding a shared restricted Boltzmann Machine (RBM) hidden layer on top of modality-specific DBNs. Additionally, the structure of data in some domains like Human-Computer Interaction (HCI), such as the view hierarchy of app screens, can potentially be modeled using graph-like structures. The field of graph representation learning is also relevant, with ongoing progress in developing evaluation benchmarks. === Diffusion maps === Another set of methods relevant to multimodal representation learning are based on diffusion maps and their extensions to handle multiple modalities. ==== Multi-view diffusion maps ==== Multi-view diffusion maps address the challenge of achieving multi-view dimensionality reduction by effectively utilizing the availability of multiple views to extract a coherent low-dimensional representation of the data. The core idea is to exploit both the intrinsic relations within each view and the mutual relations between the different views, defining a cross-view model where a random walk process implicitly hops between objects in different views. A multi-view kernel matrix is constructed by combining these relations, defining a cross-view diffusion process and associ

Small data

Small data is data that is 'small' enough for human comprehension. It is data in a volume and format that makes it accessible, informative and actionable. The term "big data" is about machines and "small data" is about people. This is to say that eyewitness observations or five pieces of related data could be small data. Small data is what we used to think of as data. The only way to comprehend Big data is to reduce the data into small, visually-appealing objects representing various aspects of large data sets (such as histogram, charts, and scatter plots). Big Data is all about finding correlations, but Small Data is all about finding the causation, the reason why. A formal definition of small data has been proposed by Allen Bonde, former vice-president of Innovation at Actuate - now part of OpenText: "Small data connects people with timely, meaningful insights (derived from big data and/or “local” sources), organized and packaged – often visually – to be accessible, understandable, and actionable for everyday tasks." Another definition of small data is: The small set of specific attributes produced by the Internet of Things. These are typically a small set of sensor data such as temperature, wind speed, vibration and status. It was estimated (2016) that “If one takes the top 100 biggest innovations of our time, perhaps around 60% to 65% percent are really based on Small Data.” as Martin Lindstrom puts it. Small data includes everything from Snapchat to simple objects such as the post-it note. Lindstrom believes we become so focused on Big-Data that we tend to forget about more basic concepts and creativity. Lindstrom defines Small Data "as seemingly insignificant observations you identify in consumers’ homes, is everything from how you place your shoes on how you hang your paintings". He thus considers that one should perfectly master the basic (Small Data) in order to mine and find correlations. == Academic Recognition and Methodology == The growing significance of "small data" as a distinct field of inquiry was highlighted by the 2024 Thematic Einstein Semester (TES) on Small Data Analysis, hosted by the Berlin Mathematics Research Center MATH+. A central focus of this semester was the transition from theoretical analysis to practical decision-making. Because small data sets are primarily used to drive specific actions, the presentation of results becomes an essential methodological step. The semester’s findings emphasized that while small data may lack volume, it often contains a high density of "many possible interpretations." Consequently, the final conference of the TES was structured around the pillars of interpretation, explanation, and knowledge gain. Participants sought to develop new mathematical and methodical representations that could accurately depict this wealth of interpretative possibilities. This work underscores that analyzing small data is not purely a computational task; it requires a robust interface between mathematics and diverse disciplines to ensure that insights are both contextually grounded and scientifically rigorous. == Uses in business == === Marketing === Bonde has written about the topic for Forbes, Direct Marketing News, CMO.com and other publications. According to Martin Lindstrom, in his book, Small Data: "{In customer research, small data is} Seemingly insignificant behavioural observations containing very specific attributes pointing towards an unmet customer need. Small data is the foundation for breakthrough ideas or completely new ways to turnaround brands." His approach is based on the combination of the observation of small samples with intuition. Marketers can obtain market insights from gathering Small Data by engaging with and observing people in their own environments. In comparison to Big Data, Small Data has the power to trigger emotions and to provide insights into the reasons behind the behaviours of customers. It may uncover detailed information on a person's extroversion or introversion, self-confidence, whether one is having problems in his/her relationship, etc. According to Lindstrom, relationships among people and customer segments are organized around four criteria: Climate: It reveals for example how a person's environment affects their diet. Rulership: The power or government in charge Religion: The prevalence of religion in a country, depending on its influence, indicates whether a person's decision making process is impacted by their belief system. Tradition: Cultural norms influence people's behaviors and interactions. Many companies underestimate the power of Small Data, using samples of millions of consumers instead of recognizing the value of closely observing small samples in their market research. In his book, Lindstrom defines "7Cs", which companies should consider in the attempt to derive meaningful customer insights and market trends through small data from their customers: Collecting: Understanding the manner in which observations are translated inside a home. Clues: Uncovering other distinctive emotional reflections that can be observed. Connecting: Identifying the consequences of emotional behaviour. Causation: Understanding what emotions are being evoked. Correlation: Identifying the initial date of appearance of the behaviour or emotion. Compensation: Identifying the unmet or unfulfilled desire. Concept: Defining the “big idea” compensation for the identified consumer need. Some of Lindstrom's clients such as Lowes Foods looked at data in a different way and actually chose to live with the customer. “As you enter their store, they have now created an amazing community where every staff member acts in a character mood, based on Small Data”. The supermarket made everything it can to make the customer feel at home. All the behaviours of employees are inspired by customer feedbacks gathered from interviews directly done at customer’s home. === Healthcare === Researchers at Cornell University started developing applications to monitor health problems in patients, based on small data. This is an initiative of Cornell's Small Data Lab, in close cooperation with Weill Cornell Medicine College, led by Deborah Estrin. The Small Data Lab developed a series of apps, focusing not only on gathering data from patients' pain but also tracking habits in areas such as grocery shopping. In the case of patients with rheumatoid arthritis for example, which has flares and remissions that do not follow a particular cycle, the app gathers information passively, thus allowing to forecast when a flare might be coming up based on small changes in behaviour. Other apps developed also include monitoring online grocery shopping, to use this information from every user to adapt their groceries to the recommendations of nutritionists, or monitoring email language to identify patterns that might indicate "fluctuations in cognitive performance, fatigue, side effects of medication or poor sleep, and other conditions and treatments that are typically self-reported and self-medicated". === Postal Service === The United States Postal Service (USPS) used optical character recognition (OCR) to automatically read and process 98% of all hand-addressed mail and 99.5% of machine-printed mail. By combining this technology with its small data sample of US zip codes, the USPS can now process more than 36,000 pieces of mail per hour. === Aerospace === In 2015, Boeing established the analytics lab for aerospace data in cooperation with the Carnegie Mellon University to leverage the university's leadership in machine learning, language technologies and data analytics. One of the initiatives projects aims to by standardize maintenance logs using AI to dramatically reduce costs. Currently, there is no standardized procedure to document maintenance logs leading to small but highly unstructured data sets. As a result, it becomes highly difficult for maintenance workers to translate these variations in maintenance logs within a short period of time. However, with AI and a narrow data set of common aircraft maintenance terminology, it becomes possible to dynamically translate these logs in real time. By using AI to enhance the speed and accuracy of the airline maintenance workflow, airlines stand to save billions according to the Harvard Business Review.