AI Face Korean

AI Face Korean — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • MLOps

    MLOps

    MLOps or ML Ops is a paradigm that aims to deploy and maintain machine learning models in production reliably and efficiently. It bridges the gap between machine learning development and production operations, ensuring that models are robust, scalable, and aligned with business goals. The word is a compound of "machine learning" and the continuous delivery practice (CI/CD) of DevOps in the software field. Machine learning models are tested and developed in isolated experimental systems. When an algorithm is ready to be launched, MLOps is practiced between data scientists, DevOps, and machine learning engineers to transition the algorithm to production systems. Similar to DevOps or DataOps approaches, MLOps seeks to increase automation and improve the quality of production models, while also focusing on business and regulatory requirements. While MLOps started as a set of best practices, it is slowly evolving into an independent approach to ML lifecycle management. MLOps applies to the entire lifecycle - from integrating with model generation (software development lifecycle, continuous integration/continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics. == Definition == MLOps is a paradigm, including aspects like best practices, sets of concepts, as well as a development culture when it comes to the end-to-end conceptualization, implementation, monitoring, deployment, and scalability of machine learning products. Most of all, it is an engineering practice that leverages three contributing disciplines: machine learning, software engineering (especially DevOps), and data engineering. MLOps is aimed at productionizing machine learning systems by bridging the gap between development (Dev) and operations (Ops). Essentially, MLOps aims to facilitate the creation of machine learning products by leveraging these principles: CI/CD automation, workflow orchestration, reproducibility; versioning of data, model, and code; collaboration; continuous ML training and evaluation; ML metadata tracking and logging; continuous monitoring; and feedback loops. == History == Interest in operationalizing machine learning systems began to grow in the mid-2010s as ML projects started moving from experimentation to production use. The challenges associated with sustaining such systems were highlighted in a 2015 paper. The predicted growth in machine learning included an estimated doubling of ML pilots and implementations from 2017 to 2018, and again from 2018 to 2020. Reports show a majority (up to 88%) of corporate machine learning initiatives are struggling to move beyond test stages. However, those organizations that actually put machine learning into production saw a 3–15% profit margin increases. The MLOps market size was USD 2,191.8 Million in 2024, and is projected to be USD 16,613.4 Million in 2030. == Architecture == Machine Learning systems can be categorized in eight different categories: data collection, data processing, feature engineering, data labeling, model design, model training and optimization, endpoint deployment, and endpoint monitoring. Each step in the machine learning lifecycle is built in its own system, but requires interconnection. These are the minimum systems that enterprises need to scale machine learning within their organization. == Goals == There are a number of goals enterprises want to achieve through MLOps systems successfully implementing ML across the enterprise, including: Deployment and automation Reproducibility of models and predictions Diagnostics Governance and regulatory compliance Scalability Collaboration Business uses Monitoring and management A standard practice, such as MLOps, takes into account each of the aforementioned areas, which can help enterprises optimize workflows and avoid issues during implementation. Vendors such as Adaptive ML deliver commercial reinforcement learning operations (RLOps) and MLOps-infrastructure, targeting organizations deploying large language models in production. A common architecture of an MLOps system would include data science platforms where models are constructed and the analytical engines where computations are performed, with the MLOps tool orchestrating the movement of machine learning models, data and outcomes between the systems.

    Read more →
  • Encyclopaedistics

    Encyclopaedistics

    Encyclopaedistics or encyclopaedics as a discipline, is the academic scholarship of encyclopedias as sources of encyclopedic knowledge and cultural objects as well; in this sense, this discipline is also known as "encyclopaedia studies" and can be termed as "theoretical encyclopaediography" by analogy with theoretical lexicography. Encyclopaedistics as a practical activity (profession or business) also called "encyclopaedic practice" or "encyclopedism" is the process of assembling encyclopaedias available to the public for sale or for free (encyclopaedia publishing or practical encyclopediography). In this sense, it is the art or craft of writing, compiling, and editing the paper or online encyclopedias. As a practical activity, encyclopaedistics originated in the Middle Ages in connection with the development of compendiums based on alphabetical structuring (e.g. first edition of Polyanthea by Dominicus Nanus Mirabellius). Encyclopaedistics is often defined as "the art and science of selecting and disseminating the information most significant to mankind". == Field of study == Encyclopaedistics is a specialized aspect of information science and communication science. At the same time, encyclopaedistics is also considered as one of scholarly disciplines which are seen as auxiliary for historical research (auxiliary sciences of history) . Third, encyclopaedics is a domain of philosophy (Romanticism). This term associated with German philosophers of the 18th century, such as Novalis, Friedrich Schlegel, who sought to create a "Scientific Bible" - both real and ideal book as the quintessence of human education (enlightenment). In any case, the most popular topics in encyclopaedia studies refferd the history of organization of encyclopaedic knowledge, encyclopaedic knowledge determination and selection, glossary composition, current state of development of encyclopaedic activity, features of making encyclopaedias and encyclopaedic articles, usage, role and significance of encyclopaedias, typology of encyclopaedic literature, encyclopaedists and encyclopaedic schools, opposition of classical encyclopaedias and Wikipedia as well as paper encyclopaedias and online encyclopaedias, case experience in building encyclopedias etc. In general, scholarly studies contribute to appearance of successful well-crafted encyclopaedias with high-quality articles. == Contemporary encyclopaedic practice == Today, academic institutions, universities, and publishing companies worldwide are engaged in encyclopaedic activity building national, multinational (universal), regional and subject-specific encyclopaedias, or doing studies related encyclopaedias. The development of national encyclopaedias is one of the prerogatives of the European Parliament in the policy of protection of accurate and verified information and in the fight against mis- and disinformation as well as in the policy of protecting, promoting and projecting Europe's values and interests in the world.

    Read more →
  • AI: When a Robot Writes a Play

    AI: When a Robot Writes a Play

    AI: When a Robot Writes a Play (in Czech: AI: Když robot píše hru) is a 2021 experimental theatre play, where 90% of its script was automatically generated by artificial intelligence (the GPT-2 language model). The play is in Czech language, but an English version of the script also exists. == Creation == The play is the first result of the THEaiTRE research project, aiming to commemorate the centenary of the R.U.R. play by Karel Čapek by investigating to what extent artificial intelligence could be used to create theatre play scripts. The script of the play was created using the THEaiTRobot tool, based on the GPT-2 language model. First, the play dramaturge, David Košťák, described the initial setting of each scene in a few sentences, and wrote the first line for each character. Next, THEaiTRobot suggested a continuation of the script, which the dramaturge could use, reject, or use part of it and let the tool generate a new continuation. Another option was to manually insert another line or a scenic remark. The script was generated in English and was automatically translated to Czech by the state-of-the-art CUBBITT machine translation tool. The resulting script was then further post-edited by the dramaturge. The resulting script was made freely available for non-commercial use both in English and in Czech, with marked manually inserted texts and manual edits. The analysis shows that 90% of the English script is automatically generated, with 10% manually written or manually post-edited. In the Czech script, a larger amount of edits were made, but the analysis claims that these additional edits are corrections of errors of the automated translation and stylistic corrections which do not change the meaning of the lines as represented by the English script, but rather bring the Czech script closer to the English one. == Characters == The play contains 9 characters. The Robot appears in all the scenes, while each of the other characters appears in only one scene. Robot – The lead character, a male humanoid robot. Master – An old man, the creator of the Robot. Boy – A schoolboy. Masseuse – A sex worker in a brothel. Stranger – An engineer. Man. Psychologist. Administrator – A female clerk at an employment agency. Actress – A film actress and a model in a robot-like costume. == Plot == The play is composed of 8 scenes. It tells the story of a humanoid robot, who encounters 8 other characters and engages into various typically human situations and activities, related to death, love, sex, violence, etc. The individual scenes are not tightly linked, but there are some linking points, such as the central character of the robot or some repeated and developing themes, such as the robot's search for love. The scenes often contain some absurd turns and it is often hard to find sense in them. It is therefore a very complicated piece interpretationally, requiring the director and the actors to invest a lot of effort and creativity in finding a meaningful interpretation which would not deviate from the script. In the interpretation by Švanda theatre, who premiered the play and who also participated on the creation of the script, the scenes typically contain non-verbally expressed content which can add a lot to the meaning of the scene compared to what is contained in the actual script (as the script only contains the lines said by the characters). === Scene 1: Death === The play opens by the Robot parting with his dying Master. The Master gives the Robot several last lessons and talks with him about death, soul, and love. === Scene 2: Sense of Humour === In the second scene, the Robot meets a sad and angry Boy, who complains that he wants to go to school, that his girlfriend is crazy, that he wants to buy a car, etc. The Robot tries to help the Boy by giving him advice, but the Boy's reactions are quite negative and irritated. The Boy then repeatedly asks the Robot to tell him a joke; the Robot keeps refusing, but ultimately tells the following joke: When you are dead. When your children are dead. When your grandchildren are dead, I will be still alive. === Scene 3: Nightclub === The Robot wants to feel pleasure, so he goes to a "night club" (a brothel), where he meets a "Masseuse" (a prostitute). The Robot is initially "a bit cold", but eventually manages to enjoy the experience and falls in love with the Masseuse. In the Švanda theatre performance, the Robot and the Masseuse seem to have a sort of virtual sex without touching each other, reminiscent of the sex scene in Demolition Man. === Scene 4: Fear of the Dark === It is the night. The Robot is standing under a lamp, unable to move away from the light as he finds that he is afraid of the dark. He meets a Stranger, an engineer who tells him that robots don't have feelings and that people cannot be trusted, and keeps hurting him. In the Švanda theatre performance, the Man repeatedly zaps the Robot with some kind of electric pulse. === Scene 5: Killer Robot === A Man approaches the Robot and repeatedly asks him to kill him. Instead, the Robot sticks a finger into the Man's anus, which leads to an argument between the Man and the Robot. === Scene 6: Burn Out === The Robot meets a Psychologist, who keeps asking him lots of questions regarding his life, burnout feeling, love, relationships, and emotions. They also talk about the Robot using a device called emotion machine which helps him to get rid of stress. === Scene 7: Search for Job === The Robot comes to an employment agency. He meets an Administrator and asks her to help him find a job. He expresses the wish to become an actor, and talks about his experience as a clown. He reveals his name to be Troy McClure, which is a character from The Simpsons who is an actor. In the Švanda theatre performance, the Administrator starts to seduce the Robot once his name is revealed, which he keeps ignoring; the Administrator then becomes irritated. === Scene 8: Love at First Sight === The Robot meets a human Actress in a robotic costume and falls in love with her immediately. The Actress is first reluctant, but the Robot manages to seduce her and she also falls in love with him. The Robot tells her about a binary world, in which he lives and where he will also take her. Ultimately, the Actress agrees, and the whole play concludes by the Robot and the Actress promising each to other to always be together. In the Švanda theatre performance, the Robot does not have a physical body in this scene, we can only hear his voice and see a pulsating light (based on the line in the script where the Robot says: "I have no body. So I don't need to wear clothes. You can't see me, you only hear me."), and the Actress eventually also agrees to lose her physical body so that she can be with the Robot forever. == Theatrical performances == The play premiered on 26 February 2021 in Švanda Theatre in Prague, Czech Republic, directed by Daniel Hrbek. Due to the COVID-19 pandemic, the play was not played in front of a live audience, but it was broadcast online, in Czech language with English subtitles. The play was followed by a panel discussion by the project members and experts on artificial intelligence. The premiere was viewed by 13,498 spectators worldwide. A short trailer of the premiere is available on YouTube. In 2021, after the opening of the theatres in the Czech Republic to spectators, the play can be viewed at Švanda Theatre. The performance takes approximately 60 minutes, and is followed by a discussion of the creators with the audience. The derniere is planned for 4 February 2023. == Reception == The play received a number of reviews, both in its country of origin as well as internationally. It is praised as first of its kind, although some reviewers note the similarity to previous works, such as the musical Beyond the Fence, the play Lifestyle of the Richard and Family, or the short movie Sunspring; however, these works used less advanced technology, and either were very short (Sunspring) or necessitated a larger amount of human interventions. The reviewers note that the script is far from perfect, with many inconsistencies and nonsensical parts, and conclude that the technology is definitely not yet ready to replace human authors; however, some find some parts of the script frighteningly human-like. The amount of human intervention is a somewhat controversial topic, with some reviewers finding the human influence too large (especially in interpreting the script and putting the play on scene), while others feel that a greater amount of human intervention would have been favorable as this could greatly improve the quality of the play. The reviews also frequently comment on the amount of sex, violence and strong language in the play; this can be attributed to the method used for creating the script, where the GPT-2 language model reflects topics and language common in the human-written articles on the internet that were used to train the model. Furthermore, some r

    Read more →
  • Single-source publishing

    Single-source publishing

    Single-source publishing, also known as single-sourcing publishing, is a content management method which allows the same source content to be used across different forms of media and more than one time. The labor-intensive and expensive work of editing need only be carried out once, on only one document; that source document (the single source of truth) can then be stored in one place and reused. This reduces the potential for error, as corrections are only made one time in the source document. The benefits of single-source publishing primarily relate to the editor rather than the user. The user benefits from the consistency that single-sourcing brings to terminology and information. This assumes the content manager has applied an organized conceptualization to the underlying content (A poor conceptualization can make single-source publishing less useful). Single-source publishing is sometimes used synonymously with multi-channel publishing though whether or not the two terms are synonymous is a matter of discussion. == Definition == While there is a general definition of single-source publishing, there is no single official delineation between single-source publishing and multi-channel publishing, nor are there any official governing bodies to provide such a delineation. Single-source publishing is most often understood as the creation of one source document in an authoring tool and converting that document into different file formats or human languages (or both) multiple times with minimal effort. Multi-channel publishing can either be seen as synonymous with single-source publishing, or similar in that there is one source document but the process itself results in more than a mere reproduction of that source. == History == The origins of single-source publishing lie, indirectly, with the release of Windows 3.0 in 1990. With the eclipsing of MS-DOS by graphical user interfaces, help files went from being unreadable text along the bottom of the screen to hypertext systems such as WinHelp. On-screen help interfaces allowed software companies to cease the printing of large, expensive help manuals with their products, reducing costs for both producer and consumer. This system raised opportunities as well, and many developers fundamentally changed the way they thought about publishing. Writers of software documentation did not simply move from being writers of traditional bound books to writers of electronic publishing, but rather they became authors of central documents which could be reused multiple times across multiple formats. The first single-source publishing project was started in 1993 by Cornelia Hofmann at Schneider Electric in Seligenstadt, using software based on Interleaf to automatically create paper documentation in multiple languages based on a single original source file. XML, developed during the mid- to late-1990s, was also significant to the development of single-source publishing as a method. XML, a markup language, allows developers to separate their documentation into two layers: a shell-like layer based on presentation and a core-like layer based on the actual written content. This method allows developers to write the content only one time while switching it in and out of multiple different formats and delivery methods. In the mid-1990s, several firms began creating and using single-source content for technical documentation (Boeing Helicopter, Sikorsky Aviation and Pratt & Whitney Canada) and user manuals (Ford owners manuals) based on tagged SGML and XML content generated using the Arbortext Epic editor with add-on functions developed by a contractor. The concept behind this usage was that complex, hierarchical content that did not lend itself to discrete componentization could be used across a variety of requirements by tagging the differences within a single document using the capabilities built into SGML and XML. Ford, for example, was able to tag its single owner's manual files so that 12 model years could be generated via a resolution script running on the single completed file. Pratt & Whitney, likewise, was able to tag up to 20 subsets of its jet engine manuals in single-source files, calling out the desired version at publication time. World Book Encyclopedia also used the concept to tag its articles for American and British versions of English. Starting from the early 2000s, single-source publishing was used with an increasing frequency in the field of technical translation. It is still regarded as the most efficient method of publishing the same material in different languages. Once a printed manual was translated, for example, the online help for the software program which the manual accompanies could be automatically generated using the method. Metadata could be created for an entire manual and individual pages or files could then be translated from that metadata with only one step, removing the need to recreate information or even database structures. Although single-source publishing is now decades old, its importance has increased urgently as of the 2010s. As consumption of information products rises and the number of target audiences expands, so does the work of developers and content creators. Within the industry of software and its documentation, there is a perception that the choice is to embrace single-source publishing or render one's operations obsolete. == Criticism == Editors using single-source publishing have been criticized for below-standard work quality, leading some critics to describe single-source publishing as the "conveyor belt assembly" of content creation. While heavily used in technical translation, there are risks of error in regard to indexing. While two words might be synonyms in English, they may not be synonyms in another language. In a document produced via single-sourcing, the index will be translated automatically and the two words will be rendered as synonyms. This is because they are synonyms in the source language, while in the target language they are not.

    Read more →
  • SUPS

    SUPS

    In computational neuroscience, SUPS (for Synaptic Updates Per Second) or formerly CUPS (Connections Updates Per Second) is a measure of a neuronal network performance, useful in fields of neuroscience, cognitive science, artificial intelligence, and computer science. == Computing == For a processor or computer designed to simulate a neural network SUPS is measured as the product of simulated neurons N {\displaystyle N} and average connectivity c {\displaystyle c} (synapses) per neuron per second: S U P S = c × N {\displaystyle SUPS=c\times N} Depending on the type of simulation it is usually equal to the total number of synapses simulated. In an "asynchronous" dynamic simulation if a neuron spikes at υ {\displaystyle \upsilon } Hz, the average rate of synaptic updates provoked by the activity of that neuron is υ c N {\displaystyle \upsilon cN} . In a synchronous simulation with step Δ t {\displaystyle \Delta t} the number of synaptic updates per second would be c N Δ t {\displaystyle {\frac {cN}{\Delta t}}} . As Δ t {\displaystyle \Delta t} has to be chosen much smaller than the average interval between two successive afferent spikes, which implies Δ t < 1 υ N {\displaystyle \Delta t<{\frac {1}{\upsilon N}}} , giving an average of synaptic updates equal to υ c N 2 {\displaystyle \upsilon cN^{2}} . Therefore, spike-driven synaptic dynamics leads to a linear scaling of computational complexity O(N) per neuron, compared with the O(N2) in the "synchronous" case. == Records == Developed in the 1980s Adaptive Solutions' CNAPS-1064 Digital Parallel Processor chip is a full neural network (NNW). It was designed as a coprocessor to a host and has 64 sub-processors arranged in a 1D array and operating in a SIMD mode. Each sub-processor can emulate one or more neurons and multiple chips can be grouped together. At 25 MHz it is capable of 1.28 GMAC. After the presentation of the RN-100 (12 MHz) single neuron chip at Seattle 1991 Ricoh developed the multi-neuron chip RN-200. It had 16 neurons and 16 synapses per neuron. The chip has on-chip learning ability using a proprietary backdrop algorithm. It came in a 257-pin PGA encapsulation and drew 3.0 W at a maximum. It was capable of 3 GCPS (1 GCPS at 32 MHz). In 1991–97, Siemens developed the MA-16 chip, SYNAPSE-1 and SYNAPSE-3 Neurocomputer. The MA-16 was a fast matrix-matrix multiplier that can be combined to form systolic arrays. It could process 4 patterns of 16 elements each (16-bit), with 16 neuron values (16-bit) at a rate of 800 MMAC or 400 MCPS at 50 MHz. The SYNAPSE3-PC PCI card contained 2 MA-16 with a peak performance of 2560 MOPS (1.28 GMAC); 7160 MOPS (3.58 GMAC) when using three boards. In 2013, the K computer was used to simulate a neural network of 1.73 billion neurons with a total of 10.4 trillion synapses (1% of the human brain). The simulation ran for 40 minutes to simulate 1 s of brain activity at a normal activity level (4.4 on average). The simulation required 1 Petabyte of storage.

    Read more →
  • Ontology merging

    Ontology merging

    Ontology merging defines the act of bringing together two conceptually divergent ontologies or the instance data associated to two ontologies. This is similar to work in database merging (schema matching). This merging process can be performed in a number of ways, manually, semi automatically, or automatically. Manual ontology merging although ideal is extremely labour-intensive and current research attempts to find semi or entirely automated techniques to merge ontologies. These techniques are statistically driven often taking into account similarity of concepts and raw similarity of instances through textual string metrics and semantic knowledge. These techniques are similar to those used in information integration employing string metrics from open source similarity libraries.

    Read more →
  • Ordered key–value store

    Ordered key–value store

    An ordered key–value store (OKVS) is a type of data storage paradigm that can support multi-model databases. An OKVS is an ordered mapping of bytes to bytes. An OKVS will keep the key–value pairs sorted by the key lexicographic order. OKVS systems provides different set of features and performance trade-offs. Most of them are shipped as a library without network interfaces, in order to be embedded in another process. Most OKVS support ACID guarantees. Some OKVS are distributed databases. Ordered key–value stores found their way into many modern database systems including NewSQL database systems. == History == The origin of ordered key–value store stems from the work of Ken Thompson on dbm in 1979. Later in 1991, Berkeley DB was released that featured a B-Tree backend that allowed the keys to stay sorted. Berkeley DB was said to be very fast and made its way into various commercial product. It was included in Python standard library until 2.7. In 2009, Tokyo Cabinet was released that was superseded by Kyoto Cabinet that support both transaction and ordered keys. In 2011, LMDB was created to replace Berkeley DB in OpenLDAP. There is also Google's LevelDB that was forked by Facebook in 2012 as RocksDB. In 2014, WiredTiger, successor of Berkeley DB was acquired by MongoDB and is since 2019 the primary backend of MongoDB database. Other notable implementation of the OKVS paradigm are Sophia and SQLite3 LSM extension. Another notable use of OKVS paradigm is the multi-model database system called ArangoDB based on RocksDB. Some NewSQL databases are supported by ordered key–value stores. JanusGraph, a property graph database, has both a Berkeley DB backend and FoundationDB backend. == Key concepts == === Lexicographic encoding === There are algorithms that encode basic data types (boolean, string, number) and composition of those data types inside sorted containers (tuple, list, vector) that preserve their natural ordering. It is possible to work with an ordered key–value store without having to work directly with bytes. In FoundationDB, it is called the tuple layer. === Range query === Inside an OKVS, keys are ordered, and because of that it is possible to do range queries. A range query retrieves all keys between two specified keys, ensuring that the fetched keys are returned in a sorted order. === Subspaces === === Key composition === One can construct key spaces to build higher level abstractions. The idea is to construct keys, that takes advantage of the ordered nature of the top level key space. When taking advantage of the ordered nature of the key space, one can query ranges of keys that have particular pattern. === Denormalization === Denormalization, as in, repeating the same piece of data in multiple subspace is common practice. It allows to create secondary representation, also called indices, that will allow to speed up queries. == Higher level abstractions == The following abstraction or databases were built on top ordered key–value stores: Timeseries database, Record Database, also known as Row store databases, they behave similarly to what is dubbed RDBMS, Tuple Stores, also known as Triple Store or Quad Store but also Generic Tuple Store, Document database, that mimics MongoDB API, Full-text search Geographic Information Systems Property Graph Versioned Data Vector space database for Approximate Nearest Neighbor All those abstraction can co-exist with the same OKVS database and when ACID is supported, the operations happens with the guarantees offered by the transaction system. == Feature matrix == == Use-cases == OKVS are useful to implement two strategies: optimize a small feature e.g. to make a 10% improvement in read or write latency; the second strategy is to take advantage of the distributed nature of FoundationDB, and TiKV, for which there is no equivalent at very large scale in resilience. Both users need to re-implement the needed high level abstractions, because there are no portable ready-to-use libraries of high-level abstraction. There is still a complex balance, of complexity, maintainability, fine-tuning, and readily available features that makes it still a choice of experts. Sometime more specialized data-structures can be faster than a high-level abstraction on top of an OKVS. Another interest of OKVS paradigm stems from it simple, and versatile interface, that makes it an interesting target for experimental storage algorithms, and data structures.

    Read more →
  • Behavior selection algorithm

    Behavior selection algorithm

    In artificial intelligence, a behavior selection algorithm, or action selection algorithm, is an algorithm that selects appropriate behaviors or actions for one or more intelligent agents. In game artificial intelligence, it selects behaviors or actions for one or more non-player characters. Common behavior selection algorithms include: Finite-state machines Hierarchical finite-state machines Decision trees Behavior trees Hierarchical task networks Hierarchical control systems Utility systems Dialogue tree (for selecting what to say) == Related concepts == In application programming, run-time selection of the behavior of a specific method is referred to as the strategy design pattern.

    Read more →
  • Cross-entropy method

    Cross-entropy method

    The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases: Draw a sample from a probability distribution. Minimize the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration. Reuven Rubinstein developed the method in the context of rare-event simulation, where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems. The method has also been applied to the traveling salesman, quadratic assignment, DNA sequence alignment, max-cut and buffer allocation problems. == Estimation via importance sampling == Consider the general problem of estimating the quantity ℓ = E u [ H ( X ) ] = ∫ H ( x ) f ( x ; u ) d x {\displaystyle \ell =\mathbb {E} _{\mathbf {u} }[H(\mathbf {X} )]=\int H(\mathbf {x} )\,f(\mathbf {x} ;\mathbf {u} )\,{\textrm {d}}\mathbf {x} } , where H {\displaystyle H} is some performance function and f ( x ; u ) {\displaystyle f(\mathbf {x} ;\mathbf {u} )} is a member of some parametric family of distributions. Using importance sampling this quantity can be estimated as ℓ ^ = 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) g ( X i ) {\displaystyle {\hat {\ell }}={\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{g(\mathbf {X} _{i})}}} , where X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} is a random sample from g {\displaystyle g\,} . For positive H {\displaystyle H} , the theoretically optimal importance sampling density (PDF) is given by g ∗ ( x ) = H ( x ) f ( x ; u ) / ℓ {\displaystyle g^{}(\mathbf {x} )=H(\mathbf {x} )f(\mathbf {x} ;\mathbf {u} )/\ell } . This, however, depends on the unknown ℓ {\displaystyle \ell } . The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the Kullback–Leibler sense) to the optimal PDF g ∗ {\displaystyle g^{}} . == Generic CE algorithm == Choose initial parameter vector v ( 0 ) {\displaystyle \mathbf {v} ^{(0)}} ; set t = 1. Generate a random sample X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} from f ( ⋅ ; v ( t − 1 ) ) {\displaystyle f(\cdot ;\mathbf {v} ^{(t-1)})} Solve for v ( t ) {\displaystyle \mathbf {v} ^{(t)}} , where v ( t ) = argmax v ⁡ 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) f ( X i ; v ( t − 1 ) ) log ⁡ f ( X i ; v ) {\displaystyle \mathbf {v} ^{(t)}=\mathop {\textrm {argmax}} _{\mathbf {v} }{\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})}}\log f(\mathbf {X} _{i};\mathbf {v} )} If convergence is reached then stop; otherwise, increase t by 1 and reiterate from step 2. In several cases, the solution to step 3 can be found analytically. Situations in which this occurs are When f {\displaystyle f\,} belongs to the natural exponential family When f {\displaystyle f\,} is discrete with finite support When H ( X ) = I { x ∈ A } {\displaystyle H(\mathbf {X} )=\mathrm {I} _{\{\mathbf {x} \in A\}}} and f ( X i ; u ) = f ( X i ; v ( t − 1 ) ) {\displaystyle f(\mathbf {X} _{i};\mathbf {u} )=f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})} , then v ( t ) {\displaystyle \mathbf {v} ^{(t)}} corresponds to the maximum likelihood estimator based on those X k ∈ A {\displaystyle \mathbf {X} _{k}\in A} . == Continuous optimization—example == The same CE algorithm can be used for optimization, rather than estimation. Suppose the problem is to maximize some function S {\displaystyle S} , for example, S ( x ) = e − ( x − 2 ) 2 + 0.8 e − ( x + 2 ) 2 {\displaystyle S(x)={\textrm {e}}^{-(x-2)^{2}}+0.8\,{\textrm {e}}^{-(x+2)^{2}}} . To apply CE, one considers first the associated stochastic problem of estimating P θ ( S ( X ) ≥ γ ) {\displaystyle \mathbb {P} _{\boldsymbol {\theta }}(S(X)\geq \gamma )} for a given level γ {\displaystyle \gamma \,} , and parametric family { f ( ⋅ ; θ ) } {\displaystyle \left\{f(\cdot ;{\boldsymbol {\theta }})\right\}} , for example the 1-dimensional Gaussian distribution, parameterized by its mean μ t {\displaystyle \mu _{t}\,} and variance σ t 2 {\displaystyle \sigma _{t}^{2}} (so θ = ( μ , σ 2 ) {\displaystyle {\boldsymbol {\theta }}=(\mu ,\sigma ^{2})} here). Hence, for a given γ {\displaystyle \gamma \,} , the goal is to find θ {\displaystyle {\boldsymbol {\theta }}} so that D K L ( I { S ( x ) ≥ γ } ‖ f θ ) {\displaystyle D_{\mathrm {KL} }({\textrm {I}}_{\{S(x)\geq \gamma \}}\|f_{\boldsymbol {\theta }})} is minimized. This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above. It turns out that parameters that minimize the stochastic counterpart for this choice of target distribution and parametric family are the sample mean and sample variance corresponding to the elite samples, which are those samples that have objective function value ≥ γ {\displaystyle \geq \gamma } . The worst of the elite samples is then used as the level parameter for the next iteration. This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an estimation of distribution algorithm. === Pseudocode === // Initialize parameters μ := −6 σ2 := 100 t := 0 maxits := 100 N := 100 Ne := 10 // While maxits not exceeded and not converged while t < maxits and σ2 > ε do // Obtain N samples from current sampling distribution X := SampleGaussian(μ, σ2, N) // Evaluate objective function at sampled points S := exp(−(X − 2) ^ 2) + 0.8 exp(−(X + 2) ^ 2) // Sort X by objective function values in descending order X := sort(X, S) // Update parameters of sampling distribution via elite samples μ := mean(X(1:Ne)) σ2 := var(X(1:Ne)) t := t + 1 // Return mean of final sampling distribution as solution return μ == Related methods == Simulated annealing Genetic algorithms Harmony search Estimation of distribution algorithm Tabu search Natural Evolution Strategy Ant colony optimization algorithms

    Read more →
  • Information architecture

    Information architecture

    Information architecture is the structural design of shared information environments, in particular the organisation of websites and software to support usability and findability. The term information architecture was coined by Richard Saul Wurman. Since its inception, information architecture has become an emerging community of practice focused on applying principles of design, architecture and information science in digital spaces. Typically, a model or concept of information is used and applied to activities which require explicit details of complex information systems. These activities include library systems and database development. == Definition == The term information architecture has different meanings in different branches of information systems or information technology. === User experience === In user experience design, information architecture has been described as the structural design of shared information environments, comprising the study and practice of organising and labelling web sites, intranets, online communities, and software to support user experience, in particular, the findability and usability of information. It has also been described as an emerging community of practice focused on bringing principles of design and architecture to the digital landscape. === Information systems === Technically speaking, information architecture comprises the combination of organization, labeling, search and navigation systems within websites and intranets, serving as a navigational aid to the content of information-rich systems. === Data architecture === Information architecture can be described as a subset of data architecture where usable data is constructed, designed, and arranged in a fashion most useful to the users of data. === Systems design === In the field of systems design, for example, information architecture is a component of enterprise architecture that deals with the information component when describing the structure of an enterprise. Some system design practitioners regard information architecture as strictly the application of information science to web design, which considers such issues as classification and information retrieval, and not factors like user experience and information design. == Principles == Principles of information architecture include the following: The principle of objects The principle of choices The principle of disclosure The principle of exemplars The principle of front doors The principle of multiple classification The principle of focused navigation The principle of growth == History == Richard Saul Wurman is credited with coining the term information architecture in relation to the design of information. From 1998 to 2015, Peter Morville and Louis Rosenfeld were co-authors of Information Architecture for the World Wide Web. Other authors include Jesse James Garrett and Christina Wodtke.

    Read more →
  • XOR swap algorithm

    XOR swap algorithm

    In computer programming, the exclusive or swap (sometimes shortened to XOR swap) is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required. The algorithm is primarily a novelty and a way of demonstrating properties of the exclusive or operation. It is sometimes discussed as a program optimization, but there are almost no cases where swapping via exclusive or provides benefit over the standard, obvious technique. == The algorithm == Conventional swapping requires the use of a temporary storage variable. Using the XOR swap algorithm, however, no temporary storage is needed. The algorithm is as follows: Since XOR is a commutative operation, either X XOR Y or Y XOR X can be used interchangeably in any of the foregoing three lines. Note that on some architectures the first operand of the XOR instruction specifies the target location at which the result of the operation is stored, preventing this interchangeability. The algorithm typically corresponds to three machine-code instructions, represented by corresponding pseudocode and assembly instructions in the three rows of the following table: In the above System/370 assembly code sample, R1 and R2 are distinct registers, and each XR operation leaves its result in the register named in the first argument. Using x86 assembly, values X and Y are in registers eax and ebx (respectively), and xor places the result of the operation in the first register (Note: x86 supports XCHG instruction so using triple XOR do not make sense on this architecture). In RISC-V assembly, value X and Y are in registers x10 and x11, and xor places the result of the operation in the first operand. However, in the pseudocode or high-level language version or implementation, the algorithm fails if x and y use the same storage location, since the value stored in that location will be zeroed out by the first XOR instruction, and then remain zero; it will not be "swapped with itself". This is not the same as if x and y have the same values. The trouble only comes when x and y use the same storage location, in which case their values must already be equal. That is, if x and y use the same storage location, then the line: sets x to zero (because x = y so X XOR Y is zero) and sets y to zero (since it uses the same storage location), causing x and y to lose their original values. == Proof of correctness == The binary operation XOR over bit strings of length N {\displaystyle N} exhibits the following properties (where ⊕ {\displaystyle \oplus } denotes XOR): L1. Commutativity: A ⊕ B = B ⊕ A {\displaystyle A\oplus B=B\oplus A} L2. Associativity: ( A ⊕ B ) ⊕ C = A ⊕ ( B ⊕ C ) {\displaystyle (A\oplus B)\oplus C=A\oplus (B\oplus C)} L3. Identity exists: there is a bit string, 0, (of length N) such that A ⊕ 0 = A {\displaystyle A\oplus 0=A} for any A {\displaystyle A} L4. Each element is its own inverse: for each A {\displaystyle A} , A ⊕ A = 0 {\displaystyle A\oplus A=0} . Suppose that we have two distinct registers R1 and R2 as in the table below, with initial values A and B respectively. We perform the operations below in sequence, and reduce our results using the properties listed above. === Linear algebra interpretation === As XOR can be interpreted as binary addition and a pair of bits can be interpreted as a vector in a two-dimensional vector space over the field with two elements, the steps in the algorithm can be interpreted as multiplication by 2×2 matrices over the field with two elements. For simplicity, assume initially that x and y are each single bits, not bit vectors. For example, the step: which also has the implicit: corresponds to the matrix ( 1 1 0 1 ) {\displaystyle \left({\begin{smallmatrix}1&1\\0&1\end{smallmatrix}}\right)} as ( 1 1 0 1 ) ( x y ) = ( x + y y ) . {\displaystyle {\begin{pmatrix}1&1\\0&1\end{pmatrix}}{\begin{pmatrix}x\\y\end{pmatrix}}={\begin{pmatrix}x+y\\y\end{pmatrix}}.} The sequence of operations is then expressed as: ( 1 1 0 1 ) ( 1 0 1 1 ) ( 1 1 0 1 ) = ( 0 1 1 0 ) {\displaystyle {\begin{pmatrix}1&1\\0&1\end{pmatrix}}{\begin{pmatrix}1&0\\1&1\end{pmatrix}}{\begin{pmatrix}1&1\\0&1\end{pmatrix}}={\begin{pmatrix}0&1\\1&0\end{pmatrix}}} (working with binary values, so 1 + 1 = 0 {\displaystyle 1+1=0} ), which expresses the elementary matrix of switching two rows (or columns) in terms of the transvections (shears) of adding one element to the other. To generalize to where X and Y are not single bits, but instead bit vectors of length n, these 2×2 matrices are replaced by 2n×2n block matrices such as ( I n I n 0 I n ) . {\displaystyle \left({\begin{smallmatrix}I_{n}&I_{n}\\0&I_{n}\end{smallmatrix}}\right).} These matrices are operating on values, not on variables (with storage locations), hence this interpretation abstracts away from issues of storage location and the problem of both variables sharing the same storage location. == Code example == A C function that implements the XOR swap algorithm: The code first checks if the addresses are distinct and uses a guard clause to exit the function early if they are equal. Without that check, if they were equal, the algorithm would fold to a triple x ^= x resulting in zero. == Reasons for avoidance in practice == On modern CPU architectures, the XOR technique can be slower than using a temporary variable to do swapping. At least on recent x86 CPUs, both by AMD and Intel, moving between registers regularly incurs zero latency. (This is called MOV-elimination.) Even if there is not any architectural register available to use, the XCHG instruction will be at least as fast as the three XORs taken together. Another reason is that modern CPUs strive to execute instructions in parallel via instruction pipelines. In the XOR technique, the inputs to each operation depend on the results of the previous operation, so they must be executed in strictly sequential order, negating any benefits of instruction-level parallelism. === Aliasing === The XOR swap is also complicated in practice by aliasing. If an attempt is made to XOR-swap the contents of some location with itself, the result is that the location is zeroed out and its value lost. Therefore, XOR swapping must not be used blindly in a high-level language if aliasing is possible. This issue does not apply if the technique is used in assembly to swap the contents of two registers. Similar problems occur with call by name, as in Jensen's Device, where swapping i and A[i] via a temporary variable yields incorrect results due to the arguments being related: swapping via temp = i; i = A[i]; A[i] = temp changes the value for i in the second statement, which then results in the incorrect i value for A[i] in the third statement. == Variations == The underlying principle of the XOR swap algorithm can be applied to any operation meeting criteria L1 through L4 above. Replacing XOR by addition and subtraction gives various slightly different, but largely equivalent, formulations. For example: Unlike the XOR swap, this variation requires that the underlying processor or programming language uses a method such as modular arithmetic or bignums to guarantee that the computation of X + Y cannot cause an error due to integer overflow. Therefore, it is seen even more rarely in practice than the XOR swap. However, the implementation of AddSwap above in the C programming language always works even in case of integer overflow, since, according to the C standard, addition and subtraction of unsigned integers follow the rules of modular arithmetic, i. e. are done in the cyclic group Z / 2 s Z {\displaystyle \mathbb {Z} /2^{s}\mathbb {Z} } where s {\displaystyle s} is the number of bits of unsigned int. Indeed, the correctness of the algorithm follows from the fact that the formulas ( x + y ) − y = x {\displaystyle (x+y)-y=x} and ( x + y ) − ( ( x + y ) − y ) = y {\displaystyle (x+y)-((x+y)-y)=y} hold in any abelian group. This generalizes the proof for the XOR swap algorithm: XOR is both the addition and subtraction in the abelian group ( Z / 2 Z ) s {\displaystyle (\mathbb {Z} /2\mathbb {Z} )^{s}} (which is the direct sum of s copies of Z / 2 Z {\displaystyle \mathbb {Z} /2\mathbb {Z} } ). This doesn't hold when dealing with the signed int type (the default for int). Signed integer overflow is an undefined behavior in C and thus modular arithmetic is not guaranteed by the standard, which may lead to incorrect results. The sequence of operations in AddSwap can be expressed via matrix multiplication as: ( 1 − 1 0 1 ) ( 1 0 1 − 1 ) ( 1 1 0 1 ) = ( 0 1 1 0 ) {\displaystyle {\begin{pmatrix}1&-1\\0&1\end{pmatrix}}{\begin{pmatrix}1&0\\1&-1\end{pmatrix}}{\begin{pmatrix}1&1\\0&1\end{pmatrix}}={\begin{pmatrix}0&1\\1&0\end{pmatrix}}} == Application to register allocation == On architectures lacking a dedicated swap instruction, because it avoids the extra temporary register, the XOR swap algorithm is required for optimal register allocatio

    Read more →
  • Data drilling

    Data drilling

    Data drilling (also drilldown) refers to any of various operations and transformations on tabular, relational, and multidimensional data. The term has widespread use in various contexts, but is primarily associated with specialized software designed specifically for data analysis. == Common data drilling operations == There are certain operations that are common to applications that allow data drilling. Among them are: Query operations: tabular query pivot query === Tabular query === Tabular query operations consist of standard operations on data tables. Among these operations are: search sort filter (by value) filter (by extended function or condition) transform (e.g., by adding or removing columns) Consider the following example: Fred and Wilma table (Fig 001): gender, fname, lname, home male, fred, chopin, Poland male, fred, flintstone, bedrock male, fred, durst, usa female, wilma, flintstone, bedrock female, wilma, rudolph, usa female, wilma, webb, usa male, fred, johnson, usa The preceding is an example of a simple flat file table formatted as comma-separated values. The table includes first name, last name, gender and home country for various people named fred or wilma. Although the example is formatted this way, it is important to emphasize that tabular query operations (as well as all data drilling operations) can be applied to any conceivable data type, regardless of the underlying formatting. The only requirement is that the data be readable by the software application in use. === Pivot query === A pivot query allows multiple representations of data according to different dimensions. This query type is similar to tabular query, except it also allows data to be represented in summary format, according to a flexible user-selected hierarchy. This class of data drilling operation is formally, (and loosely) known by different names, including crosstab query, pivot table, data pilot, selective hierarchy, intertwingularity and others. To illustrate the basics of pivot query operations, consider the Fred and Wilma table (Fig 001). A quick scan of the data reveals that the table has redundant information. This redundancy could be consolidated using an outline or a tree structure or in some other way. Moreover, once consolidated, the data could have many different alternate layouts. Using a simple text outline as output, the following alternate layouts are all possible with a pivot query: Summarize by gender (Fig 001): female flintstone, wilma rudolph, wilma webb, wilma male chopin, fred flintstone, fred durst, fred johnson, fred (Dimensions = gender; Tabular fields = lname, fname;) Summarize by home, lname (Fig 001): bedrock flintstone fred wilma Poland chopin fred usa ... (Dimensions = home, lname; Tabular fields = fname;) ==== Uses ==== Pivot query operations are useful for summarizing a corpus of data in multiple ways, thereby illustrating different representations of the same basic information. Although this type of operation appears prominently in spreadsheets and desktop database software, its flexibility is arguably under-utilized. There are many applications that allow only a 'fixed' hierarchy for representing data, and this represents a substantial limitation. == Drillup == Drillup is the opposite of drilldown. For example, if you drilldown to see the revenue of one product, then you might want to drillup to see the revenue of all products.

    Read more →
  • Feng Office Community Edition

    Feng Office Community Edition

    Feng Office Community Edition (formerly OpenGoo) is an open-source collaboration platform developed and supported by Feng Office and the OpenGoo community. It is a fully featured online office suite with a similar set of features as other online office suites, like Google Workspace, Microsoft 365, Zimbra, LibreOffice Online and Zoho Office Suite. The application can be downloaded and installed on a server. Feng Office could also be categorized as collaborative software and as personal information manager software. == Features == Feng Office Community Edition main features include project management, document management, contact management, e-mail and time management. Text documents and presentations can be created and edited online. Files can be uploaded, organized and shared, independent of file formats. Organization of the information in Feng Office Community Edition is done using workspaces and tags. The application presents the information stored using different interfaces such as lists, dashboards and calendar views. == Licensing == Feng Office Community Edition is distributed under the GNU Affero General Public License, version 3 only. == Technology used == Feng Office uses PHP, JavaScript, AJAX (ExtJS) and MySQL technology. Several open source projects served as a basis for development. ActiveCollab's last open sourced release was used as the initial code base. It includes CKEditor for online document editing. == System requirements == The server could run on any operating system. The system needs the following packages: Apache HTTP Server 2.0+ PHP 5.0+ MySQL 4.1+ (InnoDB support recommended) On the client side, the user is only required to use a modern Web browser. == History == OpenGoo started as a degree project at the faculty of Engineering of the University of the Republic, Uruguay. The project was presented and championed by Software Engineer Conrado Viña. Software Engineers Marcos Saiz and Ignacio de Soto developed the first prototype as their thesis. Professors Eduardo Fernández and Tomas Laurenzo served as tutors. Conrado, Ignacio and Marcos founded the OpenGoo community and remain active members and core developers. The thesis was approved with the highest score. In 2008, Viña joined the Uruguayan software development company Moove It. Currently there is a second project for OpenGoo at the same university being developed by students Fernando Rodríguez, Ignacio Vázquez and Juan Pedro del Campo. Their project aims to build an open source Web-based spreadsheet. In December 2009 the OpenGoo name was changed to Feng Office Community Edition.

    Read more →
  • DONE

    DONE

    The Data-based Online Nonlinear Extremumseeker (DONE) algorithm is a black-box optimization algorithm. DONE models the unknown cost function and attempts to find an optimum of the underlying function. The DONE algorithm is suitable for optimizing costly and noisy functions and does not require derivatives. An advantage of DONE over similar algorithms, such as Bayesian optimization, is that the computational cost per iteration is independent of the number of function evaluations. == Methods == The DONE algorithm was first proposed by Hans Verstraete and Sander Wahls in 2015. The algorithm fits a surrogate model based on random Fourier features and then uses a well-known L-BFGS algorithm to find an optimum of the surrogate model. == Applications == DONE was first demonstrated for maximizing the signal in optical coherence tomography measurements, but has since then been applied to various other applications. For example, it was used to help extending the field of view in light sheet fluorescence microscopy.

    Read more →
  • Semantic query

    Semantic query

    Semantic queries allow for queries and analytics of associative and contextual nature. Semantic queries enable the retrieval of both explicitly and implicitly derived information based on syntactic, semantic and structural information contained in data. They are designed to deliver precise results (possibly the distinctive selection of one single piece of information) or to answer more fuzzy and wide open questions through pattern matching and digital reasoning. Semantic queries work on named graphs, linked data or triples. This enables the query to process the actual relationships between information and infer the answers from the network of data. This is in contrast to semantic search, which uses semantics (meaning of language constructs) in unstructured text to produce a better search result. (See natural language processing.) From a technical point of view, semantic queries are precise relational-type operations much like a database query. They work on structured data and therefore have the possibility to utilize comprehensive features like operators (e.g. >, < and =), namespaces, pattern matching, subclassing, transitive relations, semantic rules and contextual full text search. The semantic web technology stack of the W3C is offering SPARQL to formulate semantic queries in a syntax similar to SQL. Semantic queries are used in triplestores, graph databases, semantic wikis, natural language and artificial intelligence systems. == Background == Relational databases represent all relationships between data in an implicit manner only. For example, the relationships between customers and products (stored in two content-tables and connected with an additional link-table) only come into existence in a query statement (SQL in the case of relational databases) written by a developer. Writing the query demands exact knowledge of the database schema. Linked-Data represent all relationships between data in an explicit manner. In the above example, no query code needs to be written. The correct product for each customer can be fetched automatically. Whereas this simple example is trivial, the real power of linked-data comes into play when a network of information is created (customers with their geo-spatial information like city, state and country; products with their categories within sub- and super-categories). Now the system can automatically answer more complex queries and analytics that look for the connection of a particular location with a product category. The development effort for this query is omitted. Executing a semantic query is conducted by walking the network of information and finding matches (also called Data Graph Traversal). Another important aspect of semantic queries is that the type of the relationship can be used to incorporate intelligence into the system. The relationship between a customer and a product has a fundamentally different nature than the relationship between a neighbourhood and its city. The latter enables the semantic query engine to infer that a customer living in Manhattan is also living in New York City whereas other relationships might have more complicated patterns and "contextual analytics". This process is called inference or reasoning and is the ability of the software to derive new information based on given facts. == Articles == Velez, Golda (2008). "Semantics Help Wall Street Cope With Data Overload". Wall Street & Technology. wallstreetandtech.com. Zhifeng, Xiao (2009). "Spatial information semantic query based on SPARQL". In Liu, Yaolin; Tang, Xinming (eds.). International Symposium on Spatial Analysis, Spatial-Temporal Data Modeling, and Data Mining. Vol. 7492. SPIE. pp. 74921P. Bibcode:2009SPIE.7492E..60X. doi:10.1117/12.838556. S2CID 62191842. Aquin, Mathieu (2010). "Watson, more than a Semantic Web search engine" (PDF). Semantic Web Journal. Dworetzky, Tom (2011). "How Siri Works: iPhone's 'Brain' Comes from Natural Language Processing". International Business Times. Horwitt, Elisabeth (2011). "The semantic Web gets down to business". computerworld.com. Rodriguez, Marko (2011). "Graph Pattern Matching with Gremlin". Marko A. Rodriguez. markorodriguez.com on Graph Computing. Sequeda, Juan (2011). "SPARQL Nuts & Bolts". Cambridge Semantics. Freitas, Andre (2012). "Querying Heterogeneous Datasets on the Linked Data Web" (PDF). IEEE Internet Computing. Kauppinen, Tomi (2012). "Using the SPARQL Package in R to handle Spatial Linked Data". linkedscience.org. Lorentz, Alissa (2013). "With Big Data, Context is a Big Issue". Wired.

    Read more →