AI Face Combiner

AI Face Combiner — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Kounta (software company)

    Kounta (software company)

    Kounta is an Australian software company founded in 2012. The company's flagship product, Kounta, comprises a cloud based point of sale mobile app. == History == Kounta was founded in 2012 by entrepreneur Nick Cloete. The company is headquartered in Sydney, Australia. In 2012, the company launched its flagship product, Kounta, a hospitality-focused point of sale (POS) mobile app for iPad, Android, Mac, and Windows. The app was initially a web-based application, and later developed into an online cash register and inventory management system that allows businesses to take payments from customers via mobile devices. The app has been made available for iPad, iPhone, and Android devices; as well as iOS, Windows, and other peripherals. In 2012, Kounta partnered with Epson, providing a cloud-based POS platform for Epson printers. In 2013, the company formed a partnership with PayPal, integrating cashless and cardless transaction options via PayPal's mobile app. In 2014, MYOB (company) made an undisclosed investment towards Kounta. This partnership led to the development of MYOB Kounta, a co-branded application merging Kounta's POS with MYOB's application software. MYOB Kounta launched in October of the same year. In 2016, Kounta announced a partnership with the Commonwealth Bank of Australia to include the Kounta app onto "Albert", the bank's EFTPOS tablet, which allowed the Commonwealth Bank of Australia to become the first bank to manage all customers operations from a single device and mobile application. == Technology == The Kounta POS is a software-as-a-service (SaaS) that runs as an application in web browsers as well as natively on iOS and Android operating systems. Kounta also incorporates an Open API, making it possible for other software providers to integrate complementary apps, further extending the software's use. Traditional IT tasks, such as data backup and encryption, hardware maintenance, and server upgrades are handled by Kounta's data center. Kounta is made accessible via paid monthly subscription licenses. == Acquisition by Lightspeed == In October 2019, Kounta was acquired by Lightspeed, an advanced commerce platform for retail, hospitality, and golf businesses based in Montreal, Canada. Lightspeed acquired Kounta for $35.3 million USD.

    Read more →
  • Quantexa

    Quantexa

    Quantexa is a UK-based software company that develops artificial intelligence-based applications for data analytics and decision-making. The company was founded in 2016 and is headquartered in London, with operations in North America, Europe, and the Asia-Pacific region. As of 2025, Quantexa reported a valuation of $2.6 billion and provides services to organizations in over 70 countries. Investors include Warburg Pincus, HSBC, and the Ontario Teachers’ Pension Plan. == History == Quantexa was founded in London in 2016 by several co-founders, including Jamie Hutton, Richard Seewald, Imam Hoque, Felix Hoddinott, and Vishal Marria, who also serves as the company's chief executive officer. The company was established to develop tools intended to address limitations in traditional data analysis methods, particularly those related to identifying hidden connections across large datasets. The name "Quantexa" is derived from the company's focus on quantitative methods and data analysis. In 2023, Quantexa acquired Dublin-based AI firm Aylien. In April 2023, the company completed a Series E funding round, raising $129 million at a valuation of approximately $1.8 billion, marking its entry into "unicorn" status. In October 2024, the company reported annual recurring revenue (ARR) exceeding $100 million. In early 2025, Quantexa participated in the World Economic Forum's Unicorn Program, which supports high-growth technology companies. In March 2025, Quantexa completed a Series F funding round of $175 million, led by Teachers' Venture Growth, the venture arm of the Ontario Teachers' Pension Plan. That August, the company was reported to be considering a 2026 IPO. The company formed a partnership with Zurich in October 2025, the first insurer to add its AI-based Decision Intelligence platform to enhance fraud detection.

    Read more →
  • Application enablement

    Application enablement

    Application enablement is an approach which brings telecommunications network providers and developers together to combine their network and web abilities in creating and delivering high demand advanced services and new intelligent applications. Network providers, in addition to bandwidth, provide abilities such as billing, location, presence, and security, which have allowed them to establish long-term relationships with end-users. By offering these select abilities as application programming interfaces (APIs), providers give developers access to a set of tools to create (mashup) new applications and services to run on provider networks. Unifying the strengths of providers and developers facilitates the creation of mash-up applications, and in turn, a better end user quality of experience (QoE) for improved profit margins. Apple's iOS with App Store, and Google's Android with Android Market exemplify this approach. Both have introduced mobile platforms that are supported by a comprehensive ecosystem in order to perpetuate innovation in product design, content and service offerings, and overall consumer behavior. By the end of April 2010, downloadable applications numbered over 200,000 for iPhone and over 50,000 for Android. == Background == Historically, telecommunication providers primarily based their business models on network performance, emphasizing connectivity, availability, and quality of service (QoS) as key sources of revenue and customer value. With the increasing demand for bandwidth-intensive data and video applications, maintaining service continuity has required substantial infrastructure investments. To address rising operational costs and declining average revenue per user (ARPU), providers have increasingly adopted customer-oriented strategies and diversified business models to expand their roles within the telecommunications value chain. Application enablement supports providers in making this transition by providing an environment, or ecosystem, where providers and developers can collaborate to build, test, manage, and distribute applications across networks including television, broadband, Internet, and mobile. This cooperative effort produces mutually beneficial results for all parties, opening up new revenue streams while enhancing value and rate of return (ROI). The following are some examples of key network abilities which function as application enablers in the telecommunications market: Billing systems Security for private transactions Network-based storage of digital content End-to-end bandwidth for high-quality transmissions Scoring abilities to identify end-user preferences and behaviors Subscriber data to customize the end-user experience Context information, such as location and presence, to localize services. == New business models == As network providers work toward effective collaboration with application and content developers, several new business models are emerging to help facilitate the business relationships: === Vendor-led === A type of business model driven by telecommunications vendors, who assist network providers in building relationships with application and content developers to lower the cost and complexity of managing third parties. Examples of this model include: Forum Nokia IBM Technology Partner Ecosystem Ng Connect Huawei Intouch program === Operator-led === Characterized by network providers who want to maintain a high degree of flexibility and control over applications created for their end-consumers, this model lets them create and manage their own developer program, development platform, and application store. Under this arrangement, independent developers provide their own branding, marketing communications, pricing and customer care. Network providers pursuing this model will often seek to partner with a large number of third parties using standardized on-boarding processes. Examples of this model include: o2 Litmus Orange Partner Joint Innovation Lab === Aggregator === Network providers who choose not to create/manage their own developer relationships will partner with one or multiple aggregators, to administer a portion of or their entire application strategy. Examples of this model include: Ovi Operator Partnership Blackberry Operator Partnership Cellmania Buongiorno === Mass wholesale === Select network providers also participate in wholesale models that exist primarily for applications (BT's Ribbit- an Internet Protocol (IP) based calling and messaging platform) and devices (Verizon's Open Device initiative). This business-to-business approach reduces a large portion of the potential costs of third party application enablement (marketing, acquisition and support). Examples of this model include: BT's Ribbit Verizon Wireless ODI AT&T Synaptic Hosting === The enterprise customer === Some network providers are focusing on enabling applications in the enterprise space. In this model, the network provider establishes a platform for their large enterprise customers who want to blend custom software with enhanced abilities, and will provide standardized processes around mobilizing enterprise applications, and exposing core back-office abilities to allow for dynamic customer interaction. Examples of this model include: Vodafone Applications Service Verizon Private Network Sprint Solution Launchpad === Trusted partner === In this model, the network provider builds one-on-one relationships with trusted third-party developers by exposing customized network abilities, bringing a greater variety of brands to the network provider's portfolio. Network providers using this model tend to only have a few partners (in contrast to the operator led model). Under this scenario, network providers benefit from a pre-established customer base and the developer's marketing resources. Examples of this model include: 3/Skype Partnership (UK) Virgin Media and BBC iPlayer == Network operator developer resources == Operator led model o2 Litmus Orange Partner Joint Innovations Lab Aggregator model Ovi Operator Partnership Cellmania Buongiorno Mass wholesale model BT Ribbit Verizon Wireless ODI AT&T Synaptic Hosting Enterprise customer model Vodafone Applications Service Verizon Private Network Sprint Solution Launchpad == Rerencesfe ==

    Read more →
  • Convolutional neural network

    Convolutional neural network

    A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization. This type of deep learning network has been applied to process and make predictions from many different types of data including text, images and audio. CNNs are the de-facto standard in deep learning-based approaches to computer vision and image processing, and have only recently been replaced—in some cases—by newer architectures such as the transformer. Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by the regularization that comes from using shared weights over fewer connections. For example, for each neuron in the fully-connected layer, 10,000 weights would be required for processing an image sized 100 × 100 pixels. However, applying cascaded convolution (or cross-correlation) kernels, only 25 weights for each convolutional layer are required to process 5x5-sized tiles. Higher-layer features are extracted from wider context windows, compared to lower-layer features. Some applications of CNNs include: image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, natural language processing, brain–computer interfaces, and financial time series. CNNs are also known as shift invariant or space invariant artificial neural networks, based on the shared-weight architecture of the convolution kernels or filters that slide along input features and provide translation-equivariant responses known as feature maps. Counter-intuitively, most convolutional neural networks are not invariant to translation, due to the downsampling operation they apply to the input. Feedforward neural networks are usually fully connected networks, that is, each neuron in one layer is connected to all neurons in the next layer. The "full connectivity" of these networks makes them prone to overfitting data. Typical ways of regularization, or preventing overfitting, include: penalizing parameters during training (such as weight decay) or trimming connectivity (skipped connections, dropout, etc.) Robust datasets also increase the probability that CNNs will learn the generalized principles that characterize a given dataset rather than the biases of a poorly-populated set. Convolutional networks were inspired by biological processes in that the connectivity pattern between neurons resembles the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap such that they cover the entire visual field. CNNs use relatively little pre-processing compared to other image classification algorithms. This means that the network learns to optimize the filters (or kernels) through automated learning, whereas in traditional algorithms these filters are hand-engineered. This simplifies and automates the process, enhancing efficiency and scalability overcoming human-intervention bottlenecks. == Architecture == A convolutional neural network consists of an input layer, hidden layers and an output layer. In a convolutional neural network, the hidden layers include one or more layers that perform convolutions. Typically this includes a layer that performs a dot product of the convolution kernel with the layer's input matrix. This product is usually the Frobenius inner product, and its activation function is commonly ReLU. As the convolution kernel slides along the input matrix for the layer, the convolution operation generates a feature map, which in turn contributes to the input of the next layer. This is followed by other layers such as pooling layers, fully connected layers, and normalization layers. Here it should be noted how close a convolutional neural network is to a matched filter. === Convolutional layers === In a CNN, the input is a tensor with shape: (number of inputs) × (input height) × (input width) × (input channels) After passing through a convolutional layer, the image becomes abstracted to a feature map, also called an activation map, with shape: (number of inputs) × (feature map height) × (feature map width) × (feature map channels). Convolutional layers convolve the input and pass its result to the next layer. This is similar to the response of a neuron in the visual cortex to a specific stimulus. Each convolutional neuron processes data only for its receptive field. Although fully connected feedforward neural networks can be used to learn features and classify data, this architecture is generally impractical for larger inputs (e.g., high-resolution images), which would require massive numbers of neurons because each pixel is a relevant input feature. A fully connected layer for an image of size 100 × 100 has 10,000 weights for each neuron in the second layer. Convolution reduces the number of free parameters, allowing the network to be deeper. For example, using a 5 × 5 tiling region, each with the same shared weights, requires only 25 neurons. Using shared weights means there are many fewer parameters, which helps avoid the vanishing gradients and exploding gradients problems seen during backpropagation in earlier neural networks. To speed processing, standard convolutional layers can be replaced by depthwise separable convolutional layers, which are based on a depthwise convolution followed by a pointwise convolution. The depthwise convolution is a spatial convolution applied independently over each channel of the input tensor, while the pointwise convolution is a standard convolution restricted to the use of 1 × 1 {\displaystyle 1\times 1} kernels. === Pooling layers === Convolutional networks may include local and/or global pooling layers along with traditional convolutional layers. Pooling layers reduce the dimensions of data by combining the outputs of neuron clusters at one layer into a single neuron in the next layer. Local pooling combines small clusters, tiling sizes such as 2 × 2 are commonly used. Global pooling acts on all the neurons of the feature map. There are two common types of pooling in popular use: max and average. Max pooling uses the maximum value of each local cluster of neurons in the feature map, while average pooling takes the average value. === Fully connected layers === Fully connected layers connect every neuron in one layer to every neuron in another layer. It is the same as a traditional multilayer perceptron neural network (MLP). Each neuron in the fully connected layer receives input from all the neurons in the previous layer. These inputs are weighted and summed with the corresponding biases, and then passed through an activation function to perform a nonlinear transformation, generating the output. The flattened matrix goes through a fully connected layer to classify the images. === Receptive field === In neural networks, each neuron receives input from some number of locations in the previous layer. In a convolutional layer, each neuron receives input from only a restricted area of the previous layer called the neuron's receptive field. Typically the area is a square (e.g. 5 by 5 neurons). Whereas, in a fully connected layer, the receptive field is the entire previous layer. Thus, in each convolutional layer, each neuron takes input from a larger area in the input than previous layers. This is due to applying the convolution over and over, which takes the value of a pixel into account, as well as its surrounding pixels. When using dilated layers, the number of pixels in the receptive field remains constant, but the field is more sparsely populated as its dimensions grow when combining the effect of several layers. To manipulate the receptive field size as desired, there are some alternatives to the standard convolutional layer. For example, atrous or dilated convolution expands the receptive field size without increasing the number of parameters by interleaving visible and blind regions. Moreover, a single dilated convolutional layer can comprise filters with multiple dilation ratios, thus having a variable receptive field size. === Weights === Each neuron in a neural network computes an output value by applying a specific function to the input values received from the receptive field in the previous layer. The function that is applied to the input values is determined by a vector of weights and a bias (typically real numbers). Learning consists of iteratively adjusting these biases and weights. The vectors of weights and biases are called filters and represent particular features of the input (e.g., a particular shape). A distinguishing feature of CNNs is that many neurons can share the same filter. This reduces the memory footprint because a single bias and a single vector of weights are used across all receptive fields that share that filter, as opposed to each receptive field having its own bias and vector

    Read more →
  • Exposure Notification

    Exposure Notification

    The (Google/Apple) Exposure Notification System (GAEN) is a framework and protocol specification developed by Apple Inc. and Google to facilitate digital contact tracing during the COVID-19 pandemic. When used by health authorities, it augments more traditional contact tracing techniques by automatically logging close approaches among notification system users using Android or iOS smartphones. Exposure Notification is a decentralized reporting protocol built on a combination of Bluetooth Low Energy technology and privacy-preserving cryptography. It is an opt-in feature within COVID-19 apps developed and published by authorized health authorities. Unveiled on April 10, 2020, it was made available on iOS on May 20, 2020, as part of the iOS 13.5 update and on December 14, 2020, as part of the iOS 12.5 update for older iPhones. On Android, it was added to devices via a Google Play Services update, supporting all versions since Android Marshmallow. The Apple/Google protocol is similar to the Decentralized Privacy-Preserving Proximity Tracing (DP-3T) protocol created by the European DP-3T consortium and the Temporary Contact Number (TCN) protocol by Covid Watch, but is implemented at the operating system level, which allows for more efficient operation as a background process. Since May 2020, a variant of the DP-3T protocol is supported by the Exposure Notification Interface. Other protocols are constrained in operation because they are not privileged over normal apps. This leads to issues, particularly on iOS devices where digital contact tracing apps running in the background experience significantly degraded performance. The joint approach is also designed to maintain interoperability between Android and iOS devices, which constitute nearly all of the market. The ACLU stated the approach "appears to mitigate the worst privacy and centralization risks, but there is still room for improvement". In late April, Google and Apple shifted the emphasis of the naming of the system, describing it as an "exposure notification service", rather than "contact tracing" system. == Technical specification == Digital contact tracing protocols typically have two major responsibilities: encounter logging and infection reporting. Exposure Notification only involves encounter logging which is a decentralized architecture. The majority of infection reporting is centralized in individual app implementations. To handle encounter logging, the system uses Bluetooth Low Energy to send tracking messages to nearby devices running the protocol to discover encounters with other people. The tracking messages contain unique identifiers that are encrypted with a secret daily key held by the sending device. These identifiers change every 15–20 minutes as well as Bluetooth MAC address in order to prevent tracking of clients by malicious third parties through observing static identifiers over time. The sender's daily encryption keys are generated using a random number generator. Devices record received messages, retaining them locally for 14 days. If a user tests positive for infection, the last 14 days of their daily encryption keys can be uploaded to a central server, where it is then broadcast to all devices on the network. The method through which daily encryption keys are transmitted to the central server and broadcast is defined by individual app developers. The Google-developed reference implementation calls for a health official to request a one-time verification code (VC) from a verification server, which the user enters into the encounter logging app. This causes the app to obtain a cryptographically signed certificate, which is used to authorize the submission of keys to the central reporting server. The received keys are then provided to the protocol, where each client individually searches for matches in their local encounter history. If a match meeting certain risk parameters is found, the app notifies the user of potential exposure to the infection. Google and Apple intend to use the received signal strength (RSSI) of the beacon messages as a source to infer proximity. RSSI and other signal metadata will also be encrypted to resist deanonymization attacks. === Version 1.0 === To generate encounter identifiers, first a persistent 32-byte private Tracing Key ( t k {\displaystyle tk} ) is generated by a client. From this a 16 byte Daily Tracing Key is derived using the algorithm d t k i = H K D F ( t k , N U L L , 'CT-DTK' | | D i , 16 ) {\displaystyle dtk_{i}=HKDF(tk,NULL,{\text{'CT-DTK'}}||D_{i},16)} , where H K D F ( Key, Salt, Data, OutputLength ) {\displaystyle HKDF({\text{Key, Salt, Data, OutputLength}})} is a HKDF function using SHA-256, and D i {\displaystyle D_{i}} is the day number for the 24-hour window the broadcast is in starting from Unix Epoch Time. These generated keys are later sent to the central reporting server should a user become infected. From the daily tracing key a 16-byte temporary Rolling Proximity Identifier is generated every 10 minutes with the algorithm R P I i , j = Truncate ( H M A C ( d t k i , 'CT-RPI' | | T I N j ) , 16 ) {\displaystyle RPI_{i,j}={\text{Truncate}}(HMAC(dtk_{i},{\text{'CT-RPI'}}||TIN_{j}),16)} , where H M A C ( Key, Data ) {\displaystyle HMAC({\text{Key, Data}})} is a HMAC function using SHA-256, and T I N j {\displaystyle TIN_{j}} is the time interval number, representing a unique index for every 10 minute period in a 24-hour day. The Truncate function returns the first 16 bytes of the HMAC value. When two clients come within proximity of each other they exchange and locally store the current R P I i , j {\displaystyle RPI_{i,j}} as the encounter identifier. Once a registered health authority has confirmed the infection of a user, the user's Daily Tracing Key for the past 14 days is uploaded to the central reporting server. Clients then download this report and individually recalculate every Rolling Proximity Identifier used in the report period, matching it against the user's local encounter log. If a matching entry is found, then contact has been established and the app presents a notification to the user warning them of potential infection. === Version 1.1 === Unlike version 1.0 of the protocol, version 1.1 does not use a persistent tracing key, rather every day a new random 16-byte Temporary Exposure Key ( t e k i {\displaystyle tek_{i}} ) is generated. This is analogous to the daily tracing key from version 1.0. Here i {\displaystyle i} denotes the time is discretized in 10 minute intervals starting from Unix Epoch Time. From this two 128-bit keys are calculated, the Rolling Proximity Identifier Key ( R P I K i {\displaystyle RPIK_{i}} ) and the Associated Encrypted Metadata Key ( A E M K i {\displaystyle AEMK_{i}} ). R P I K i {\displaystyle RPIK_{i}} is calculated with the algorithm R P I K i = H K D F ( t e k i , N U L L , 'EN-RPIK' , 16 ) {\displaystyle RPIK_{i}=HKDF(tek_{i},NULL,{\text{'EN-RPIK'}},16)} , and A E M K i {\displaystyle AEMK_{i}} using the algorithm A E M K i = H K D F ( t e k i , N U L L , 'EN-AEMK' , 16 ) {\displaystyle AEMK_{i}=HKDF(tek_{i},NULL,{\text{'EN-AEMK'}},16)} . From these values a temporary Rolling Proximity Identifier ( R P I i , j {\displaystyle RPI_{i,j}} ) is generated every time the BLE MAC address changes, roughly every 15–20 minutes. The following algorithm is used: R P I i , j = A E S 128 ( R P I K i , 'EN-RPI' | | 0 x 000000000000 | | E N I N j ) {\displaystyle RPI_{i,j}=AES128(RPIK_{i},{\text{'EN-RPI'}}||{\mathtt {0x000000000000}}||ENIN_{j})} , where A E S 128 ( Key, Data ) {\displaystyle AES128({\text{Key, Data}})} is an AES cryptography function with a 128-bit key, the data is one 16-byte block, j {\displaystyle j} denotes the Unix Epoch Time at the moment the roll occurs, and E N I N j {\displaystyle ENIN_{j}} is the corresponding 10-minute interval number. Next, additional Associated Encrypted Metadata is encrypted. What the metadata represents is not specified, likely to allow the later expansion of the protocol. The following algorithm is used: Associated Encrypted Metadata i , j = A E S 128 _ C T R ( A E M K i , R P I i , j , Metadata ) {\displaystyle {\text{Associated Encrypted Metadata}}_{i,j}=AES128\_CTR(AEMK_{i},RPI_{i,j},{\text{Metadata}})} , where A E S 128 _ C T R ( Key, IV, Data ) {\displaystyle AES128\_CTR({\text{Key, IV, Data}})} denotes AES encryption with a 128-bit key in CTR mode. The Rolling Proximity Identifier and the Associated Encrypted Metadata are then combined and broadcast using BLE. Clients exchange and log these payloads. Once a registered health authority has confirmed the infection of a user, the user's Temporary Exposure Keys t e k i {\displaystyle tek_{i}} and their respective interval numbers i {\displaystyle i} for the past 14 days are uploaded to the central reporting server. Clients then download this report and individually recalculate every Rolling Proximity Identifier starting from interval number i {\displaystyle i} ,

    Read more →
  • Ayoba

    Ayoba

    Ayoba is an African communication platform developed in South Africa. It is owned by Progressive Tech Holdings in Mauritius and managed by SIMFY Africa. Launched on May 4, 2019, as of April 2024, it has over 35 million active users. == History == Ayoba was first published on Google Play in February 2019. Its first marketing campaign and brand launch took place in Cameroon on May 4, 2019. In June 2019, the platform introduced its first eight channels. In November 2019, the platform reached one million active users, which increased to two million by June 2020. Subsequently, ayoba expanded its services, including the launch of games for Android in February 2020, Momo (Mobile Money) in Cameroon in May 2020, and MicroApps in May 2020. It also launched music and voice and video calling features in 12 territories in August 2020. The first version of ayoba for iOS was released in September 2020. In December of the same year, games and Messaging 2.0 were launched on the platform. In November 2020, it won Best Mobile Application at the African Digital Awards. In 2021, it won OTT Brand of the Year at the Marketing World Awards in Ghana. In December 2022, it received Top Innovative Technology and Telecom Product of the Year at the National Communications Awards in December 2022. In June 2023 ayoba partnered with BoomPlay and as of April 2024, it had 35 million monthly active users. Ayoba has partnered with Jumia Ghana to offer exclusive deals to users. Ayoba users can get a 10% discount on selected Jumia purchases through the app, with no data charges for MTN users. This partnership aims to make online shopping more affordable and accessible by integrating Jumia's offers into the ayoba app. Ayoba supports over 35 million users across Africa and provides services in 22 languages. To access the deals, users can download the ayoba app from the Google Play Store, iOS Store, or the official website. == Platform features == Chat, Call and Share: ayoba enables instant messaging, voice notes, picture sharing, and file sharing with contacts, even if they do not have the app installed. The app supports voice and video calls on both Android and iOS, as well as group chats, help channel and SMS continuity (non ayoba users receive messages as SMS, their responses appear in the ayoba app). Music: ayoba offers a free music player with daily updates on international and African music. Users can find playlists for different genres. Games: ayoba provides a selection of interactive games, including action, adventure, and children's games available on both Android and iOS. Mobile Money Transfers: In certain territories, ayoba supports mobile money transfers using MTN Mobile Money (MoMo) for transactions within the app. MicroApps: ayoba features individual MicroApps within the platform that offer content and services, including streaming channels, podcasts, and specialized apps. The availability of these apps may vary by country. == Operations == ayoba primarily focuses on the following territories: Nigeria, Cameroon, South Africa, Ghana, Côte d'Ivoire, Uganda, Republic of Congo, Benin, Zambia, Tanzania, Kenya, Senegal, Togo, Guinea Bissau, Guinea Conakry, Sudan, South Sudan, and Liberia. The company operates from its offices in Cape Town and Johannesburg, South Africa. David Gillaranz served as the CEO from 2019 to 2021, and Burak Akinci has been the CEO since 2021.

    Read more →
  • Agent Ruby

    Agent Ruby

    Agent Ruby (1998–2002) by Lynn Hershman Leeson is an interactive, multiuser work using artificial intelligence. == Description == On Agent Ruby's website, "Agent Ruby's Edream Portal," a female face moves her eyes and lips. Ruby, named from Hershman Leeson's own film, Teknolust, answers questions and often responds that she needs a better algorithm to answer questions not within her database. The work, created with AI, explores relationships between real and virtual worlds. Hershman Leeson had created an earlier version of Ruby, CyberRoberta, which was a custom-made doll with webcam eyes that interacted with the internet. The work in a gallery provides a screen and a sign inviting gallery-goers to "Chat with Ruby." == Artificial intelligence == In 2015 when Agent Ruby was exhibited at the gallery Modern Art Oxford, a review in Aesthetica Magazine described it as an artificial intelligence agent. A review in New Scientist noted that "Ruby is a fast learner, but perhaps not a natural conversationalist." A 2024 list of "25 Essential AI Artworks" published by ARTnews wrote that while "Agent Ruby's capabilities seem limited by today's standards," it was extensive for its day. == Publications and exhibitions == Agent Ruby was commissioned and displayed at the San Francisco Museum of Modern Art, Modern Art Oxford, and the ZKM Center for Art and Media in Karlsruhe, Germany. The San Francisco Museum of Modern Art (SFMOMA) presented Lynn Hershman Leeson: The Agent Ruby Files, March 30 through June 2, 2013 which presented the project server's archive of user conversations over the 12 years of exhibitions.

    Read more →
  • Neighborhood operation

    Neighborhood operation

    In computer vision and image processing a neighborhood operation is a commonly used class of computations on image data which implies that it is processed according to the following pseudo code: Visit each point p in the image data and do { N = a neighborhood or region of the image data around the point p result(p) = f(N) } This general procedure can be applied to image data of arbitrary dimensionality. Also, the image data on which the operation is applied does not have to be defined in terms of intensity or color, it can be any type of information which is organized as a function of spatial (and possibly temporal) variables in p. The result of applying a neighborhood operation on an image is again something which can be interpreted as an image, it has the same dimension as the original data. The value at each image point, however, does not have to be directly related to intensity or color. Instead it is an element in the range of the function f, which can be of arbitrary type. Normally the neighborhood N is of fixed size and is a square (or a cube, depending on the dimensionality of the image data) centered on the point p. Also the function f is fixed, but may in some cases have parameters which can vary with p, see below. In the simplest case, the neighborhood N may be only a single point. This type of operation is often referred to as a point-wise operation. == Examples == The most common examples of a neighborhood operation use a fixed function f which in addition is linear, that is, the computation consists of a linear shift invariant operation. In this case, the neighborhood operation corresponds to the convolution operation. A typical example is convolution with a low-pass filter, where the result can be interpreted in terms of local averages of the image data around each image point. Other examples are computation of local derivatives of the image data. It is also rather common to use a fixed but non-linear function f. This includes median filtering, and computation of local variances. The Nagao-Matsuyama filter is an example of a complex local neighbourhood operation that uses variance as an indicator of the uniformity within a pixel group. The result is similar to a convolution with a low-pass filter with the added effect of preserving sharp edges. There is also a class of neighborhood operations in which the function f has additional parameters which can vary with p: Visit each point p in the image data and do { N = a neighborhood or region of the image data around the point p result(p) = f(N, parameters(p)) } This implies that the result is not shift invariant. Examples are adaptive Wiener filters. == Implementation aspects == The pseudo code given above suggests that a neighborhood operation is implemented in terms of an outer loop over all image points. However, since the results are independent, the image points can be visited in arbitrary order, or can even be processed in parallel. Furthermore, in the case of linear shift-invariant operations, the computation of f at each point implies a summation of products between the image data and the filter coefficients. The implementation of this neighborhood operation can then be made by having the summation loop outside the loop over all image points. An important issue related to neighborhood operation is how to deal with the fact that the neighborhood N becomes more or less undefined for points p close to the edge or border of the image data. Several strategies have been proposed: Compute result only for points p for which the corresponding neighborhood is well-defined. This implies that the output image will be somewhat smaller than the input image. Zero padding: Extend the input image sufficiently by adding extra points outside the original image which are set to zero. The loops over the image points described above visit only the original image points. Border extension: Extend the input image sufficiently by adding extra points outside the original image which are set to the image value at the closest image point. The loops over the image points described above visit only the original image points. Mirror extension: Extend the image sufficiently much by mirroring the image at the image boundaries. This method is less sensitive to local variations at the image boundary than border extension. Wrapping: The image is tiled, so that going off one edge wraps around to the opposite side of the image. This method assumes that the image is largely homogeneous, for example a stochastic image texture without large textons.

    Read more →
  • Vulnerability Discovery Model

    Vulnerability Discovery Model

    A Vulnerability Discovery Model (VDM) uses discovery event data with software reliability models for predicting the same. A thorough presentation of VDM techniques is available in. Numerous model implementations are available in the MCMCBayes open source repository. Several VDM examples include: Alhazmi-Malaiya: Time based model (Alhazmi-Malaiya Logistic (AML) model) Alhazmi-Malaiya: Effort based model Rescorla: Quadratic Model and Exponential Model Anderson: Thermodynamic Model Kim: Weibull Model Linear Model Hump-Shaped Model Independent and Dependent Model Vulnerability Discovery Modeling using Bayesian model averaging Multivariate Vulnerability Discovery Models

    Read more →
  • Hybrid machine translation

    Hybrid machine translation

    Hybrid machine translation is a method of machine translation that is characterized by the use of multiple machine translation approaches within a single machine translation system. The motivation for developing hybrid machine translation systems stems from the failure of any single technique to achieve a satisfactory level of accuracy. Many hybrid machine translation systems have been successful in improving the accuracy of the translations, and there are several popular machine translation systems which employ hybrid methods. == Approaches == === Multi-engine === This approach to hybrid machine translation involves running multiple machine translation systems in parallel. The final output is generated by combining the output of all the sub-systems. Most commonly, these systems use statistical and rule-based translation subsystems, but other combinations have been explored. For example, researchers at Carnegie Mellon University have had some success combining example-based, transfer-based, knowledge-based and statistical translation sub-systems into one machine translation system. === Statistical rule generation === This approach involves using statistical data to generate lexical and syntactic rules. The input is then processed with these rules as if it were a rule-based translator. This approach attempts to avoid the difficult and time-consuming task of creating a set of comprehensive, fine-grained linguistic rules by extracting those rules from the training corpus. This approach still suffers from many problems of normal statistical machine translation, namely that the accuracy of the translation will depend heavily on the similarity of the input text to the text of the training corpus. As a result, this technique has had the most success in domain-specific applications, and has the same difficulties with domain adaptation as many statistical machine translation systems. === Multi-Pass === This approach involves serially processing the input multiple times. The most common technique used in multi-pass machine translation systems is to pre-process the input with a rule-based machine translation system. The output of the rule-based pre-processor is passed to a statistical machine translation system, which produces the final output. This technique is used to limit the amount of information a statistical system need consider, significantly reducing the processing power required. It also removes the need for the rule-based system to be a complete translation system for the language, significantly reducing the amount of human effort and labor necessary to build the system. === Confidence-Based === This approach differs from the other hybrid approaches in that in most cases only one translation technology is used. A confidence metric is produced for each translated sentence from which a decision can be made whether to try a secondary translation technology or to proceed with the initial translation output. SMT is also used when common error patterns such as multiple repeat words appear in sequence, as is common with NMT when the attention mechanism is confused.

    Read more →
  • Tango (platform)

    Tango (platform)

    Tango (named Project Tango while in testing) was an augmented reality computing platform, developed and authored by the Advanced Technology and Projects (ATAP), a skunkworks division of Google. It used computer vision to enable mobile devices, such as smartphones and tablets, to detect their position relative to the world around them without using GPS or other external signals. This allowed application developers to create user experiences that include indoor navigation, 3D mapping, physical space measurement, environmental recognition, augmented reality, and windows into a virtual world. The first product to emerge from ATAP, Tango was developed by a team led by computer scientist Johnny Lee, a core contributor to Microsoft's Kinect. In an interview in June 2015, Lee said, "We're developing the hardware and software technologies to help everything and everyone understand precisely where they are, anywhere." Google produced two devices to demonstrate the Tango technology: the Peanut phone and the Yellowstone 7-inch tablet. More than 3,000 of these devices had been sold as of June 2015, chiefly to researchers and software developers interested in building applications for the platform. In the summer of 2015, Qualcomm and Intel both announced that they were developing Tango reference devices as models for device manufacturers who use their mobile chipsets. At CES, in January 2016, Google announced a partnership with Lenovo to release a consumer smartphone during the summer of 2016 to feature Tango technology marketed at consumers, noting a less than $500 price-point and a small form factor below 6.5 inches. At the same time, both companies also announced an application incubator to get applications developed to be on the device on launch. On 15 December 2017, Google announced that they would be ending support for Tango on March 1, 2018, in favor of ARCore. == Overview == Tango was different from other contemporary 3D-sensing computer vision products, in that it was designed to run on a standalone mobile phone or tablet and was chiefly concerned with determining the device's position and orientation within the environment. The software worked by integrating three types of functionality: Motion-tracking: using visual features of the environment, in combination with accelerometer and gyroscope data, to closely track the device's movements in space Area learning: storing environment data in a map that can be re-used later, shared with other Tango devices, and enhanced with metadata such as notes, instructions, or points of interest Depth perception: detecting distances, sizes, and surfaces in the environment Together, these generate data about the device in "six degrees of freedom" (3 axes of orientation plus 3 axes of position) and detailed three-dimensional information about the environment. Project Tango was also the first project to graduate from Google X in 2012 Applications on mobile devices use Tango's C and Java APIs to access this data in real time. In addition, an API was also provided for integrating Tango with the Unity game engine; this enabled the conversion or creation of games that allow the user to interact and navigate in the game space by moving and rotating a Tango device in real space. These APIs were documented on the Google developer website. == Applications == Tango enabled apps to track a device's position and orientation within a detailed 3D environment, and to recognize known environments. This allowed the creations of applications such as in-store navigation, visual measurement and mapping utilities, presentation and design tools, and a variety of immersive games. At Augmented World Expo 2015, Johnny Lee demonstrated a construction game that builds a virtual structure in real space, an AR showroom app that allows users to view a full-size virtual automobile and customize its features, a hybrid Nerf gun with mounted Tango screen for dodging and shooting AR monsters superimposed on reality, and a multiplayer VR app that lets multiple players converse in a virtual space where their avatar movements match their real-life movements. Tango apps are distributed through Play. Google has encouraged the development of more apps with hackathons, an app contest, and promotional discounts on the development tablet. == Devices == As a platform for software developers and a model for device manufacturers, Google created two Tango devices. === The Peanut phone === "Peanut" was the first production Tango device, released in the first quarter of 2014. It was a small Android phone with a Qualcomm MSM8974 quad-core processor and additional special hardware including a fisheye motion camera, "RGB-IR" camera for color image and infrared depth detection, and Movidius Vision processing units. A high-performance accelerometer and gyroscope were added after testing several competing models in the MARS lab at the University of Minnesota. Several hundred Peanut devices were distributed to early-access partners including university researchers in computer vision and robotics, as well as application developers and technology startups. Google stopped supporting the Peanut device in September 2015, as by then the Tango software stack had evolved beyond the versions of Android that run on the device. === The Yellowstone tablet === "Yellowstone" was a 7-inch tablet with full Tango functionality, released in June 2014, and sold as the Project Tango Tablet Development Kit. It featured a 2.3 GHz quad-core Nvidia Tegra K1 processor, 128GB flash memory, 1920x1200-pixel touchscreen, 4MP color camera, fisheye-lens (motion-tracking) camera, an IR projector with RGB-IR camera for integrated depth sensing, and 4G LTE connectivity. As of May 27, 2017, the Tango tablet is considered officially unsupported by Google. ==== Testing by NASA ==== In May 2014, two Peanut phones were delivered to the International Space Station to be part of a NASA project to develop autonomous robots that navigate in a variety of environments, including outer space. The soccer-ball-sized, 18-sided polyhedral SPHERES robots were developed at the NASA Ames Research Center, adjacent to the Google campus in Mountain View, California. Andres Martinez, SPHERES manager at NASA, said "We are researching how effective [Tango's] vision-based navigation abilities are for performing localization and navigation of a mobile free flyer on ISS. === Intel RealSense smartphone === Announced at Intel's Developer Forum in August 2015, and offered to public through a Developer Kit since January 2016. It incorporated a RealSense ZR300 camera which had optical features required for Tango, such as the fisheye camera. === Lenovo Phab 2 Pro === Lenovo Phab 2 Pro was the first commercial smartphone with the Tango Technology, the device was announced at the beginning of 2016, launched in August, and available for purchase in the US in November. The Phab 2 Pro had a 6.4 inch screen, a Snapdragon 652 processor, and 64 GB of internal storage, with a rear facing 16 Megapixels camera and 8 MP front camera. === Asus Zenfone AR === Asus Zenfone AR, announced at CES 2017, was the second commercial smartphone with the Tango Technology. It ran Tango AR & Daydream VR on Snapdragon 821, with 6GB or 8GB of RAM and 128 or 256GB of internal memory depending on the configuration.

    Read more →
  • Doubao

    Doubao

    Doubao (Chinese: 豆包) is an artificial intelligence assistant developed by ByteDance. == History == The chatbot was launched in August 2023. By November 2024, it had become China's most popular AI chatbot, with approximately 60 million monthly active users according to industry analytics. == Design == Doubao is powered by Volcano Engine (Volcengine), 120 trillion tokens consumed per day. == Variants == === Dola === The international version of Doubao is Dola which was launched in August 2023 as Cici. Dola is powered by OpenAI's GPT series of large language models and by Google's Gemini.

    Read more →
  • Tango (platform)

    Tango (platform)

    Tango (named Project Tango while in testing) was an augmented reality computing platform, developed and authored by the Advanced Technology and Projects (ATAP), a skunkworks division of Google. It used computer vision to enable mobile devices, such as smartphones and tablets, to detect their position relative to the world around them without using GPS or other external signals. This allowed application developers to create user experiences that include indoor navigation, 3D mapping, physical space measurement, environmental recognition, augmented reality, and windows into a virtual world. The first product to emerge from ATAP, Tango was developed by a team led by computer scientist Johnny Lee, a core contributor to Microsoft's Kinect. In an interview in June 2015, Lee said, "We're developing the hardware and software technologies to help everything and everyone understand precisely where they are, anywhere." Google produced two devices to demonstrate the Tango technology: the Peanut phone and the Yellowstone 7-inch tablet. More than 3,000 of these devices had been sold as of June 2015, chiefly to researchers and software developers interested in building applications for the platform. In the summer of 2015, Qualcomm and Intel both announced that they were developing Tango reference devices as models for device manufacturers who use their mobile chipsets. At CES, in January 2016, Google announced a partnership with Lenovo to release a consumer smartphone during the summer of 2016 to feature Tango technology marketed at consumers, noting a less than $500 price-point and a small form factor below 6.5 inches. At the same time, both companies also announced an application incubator to get applications developed to be on the device on launch. On 15 December 2017, Google announced that they would be ending support for Tango on March 1, 2018, in favor of ARCore. == Overview == Tango was different from other contemporary 3D-sensing computer vision products, in that it was designed to run on a standalone mobile phone or tablet and was chiefly concerned with determining the device's position and orientation within the environment. The software worked by integrating three types of functionality: Motion-tracking: using visual features of the environment, in combination with accelerometer and gyroscope data, to closely track the device's movements in space Area learning: storing environment data in a map that can be re-used later, shared with other Tango devices, and enhanced with metadata such as notes, instructions, or points of interest Depth perception: detecting distances, sizes, and surfaces in the environment Together, these generate data about the device in "six degrees of freedom" (3 axes of orientation plus 3 axes of position) and detailed three-dimensional information about the environment. Project Tango was also the first project to graduate from Google X in 2012 Applications on mobile devices use Tango's C and Java APIs to access this data in real time. In addition, an API was also provided for integrating Tango with the Unity game engine; this enabled the conversion or creation of games that allow the user to interact and navigate in the game space by moving and rotating a Tango device in real space. These APIs were documented on the Google developer website. == Applications == Tango enabled apps to track a device's position and orientation within a detailed 3D environment, and to recognize known environments. This allowed the creations of applications such as in-store navigation, visual measurement and mapping utilities, presentation and design tools, and a variety of immersive games. At Augmented World Expo 2015, Johnny Lee demonstrated a construction game that builds a virtual structure in real space, an AR showroom app that allows users to view a full-size virtual automobile and customize its features, a hybrid Nerf gun with mounted Tango screen for dodging and shooting AR monsters superimposed on reality, and a multiplayer VR app that lets multiple players converse in a virtual space where their avatar movements match their real-life movements. Tango apps are distributed through Play. Google has encouraged the development of more apps with hackathons, an app contest, and promotional discounts on the development tablet. == Devices == As a platform for software developers and a model for device manufacturers, Google created two Tango devices. === The Peanut phone === "Peanut" was the first production Tango device, released in the first quarter of 2014. It was a small Android phone with a Qualcomm MSM8974 quad-core processor and additional special hardware including a fisheye motion camera, "RGB-IR" camera for color image and infrared depth detection, and Movidius Vision processing units. A high-performance accelerometer and gyroscope were added after testing several competing models in the MARS lab at the University of Minnesota. Several hundred Peanut devices were distributed to early-access partners including university researchers in computer vision and robotics, as well as application developers and technology startups. Google stopped supporting the Peanut device in September 2015, as by then the Tango software stack had evolved beyond the versions of Android that run on the device. === The Yellowstone tablet === "Yellowstone" was a 7-inch tablet with full Tango functionality, released in June 2014, and sold as the Project Tango Tablet Development Kit. It featured a 2.3 GHz quad-core Nvidia Tegra K1 processor, 128GB flash memory, 1920x1200-pixel touchscreen, 4MP color camera, fisheye-lens (motion-tracking) camera, an IR projector with RGB-IR camera for integrated depth sensing, and 4G LTE connectivity. As of May 27, 2017, the Tango tablet is considered officially unsupported by Google. ==== Testing by NASA ==== In May 2014, two Peanut phones were delivered to the International Space Station to be part of a NASA project to develop autonomous robots that navigate in a variety of environments, including outer space. The soccer-ball-sized, 18-sided polyhedral SPHERES robots were developed at the NASA Ames Research Center, adjacent to the Google campus in Mountain View, California. Andres Martinez, SPHERES manager at NASA, said "We are researching how effective [Tango's] vision-based navigation abilities are for performing localization and navigation of a mobile free flyer on ISS. === Intel RealSense smartphone === Announced at Intel's Developer Forum in August 2015, and offered to public through a Developer Kit since January 2016. It incorporated a RealSense ZR300 camera which had optical features required for Tango, such as the fisheye camera. === Lenovo Phab 2 Pro === Lenovo Phab 2 Pro was the first commercial smartphone with the Tango Technology, the device was announced at the beginning of 2016, launched in August, and available for purchase in the US in November. The Phab 2 Pro had a 6.4 inch screen, a Snapdragon 652 processor, and 64 GB of internal storage, with a rear facing 16 Megapixels camera and 8 MP front camera. === Asus Zenfone AR === Asus Zenfone AR, announced at CES 2017, was the second commercial smartphone with the Tango Technology. It ran Tango AR & Daydream VR on Snapdragon 821, with 6GB or 8GB of RAM and 128 or 256GB of internal memory depending on the configuration.

    Read more →
  • Multi-agent reinforcement learning

    Multi-agent reinforcement learning

    Multi-agent reinforcement learning (MARL) is a sub-field of reinforcement learning. It focuses on studying the behavior of multiple learning agents that coexist in a shared environment. Each agent is motivated by its own rewards, and does actions to advance its own interests; in some environments these interests are opposed to the interests of other agents, resulting in complex group dynamics. Multi-agent reinforcement learning is closely related to game theory and especially repeated games, as well as multi-agent systems. Its study combines the pursuit of finding ideal algorithms that maximize rewards with a more sociological set of concepts. While research in single-agent reinforcement learning is concerned with finding the algorithm that gets the biggest number of points for one agent, research in multi-agent reinforcement learning evaluates and quantifies social metrics, such as cooperation, reciprocity, equity, social influence, language and discrimination. == Definition == Similarly to single-agent reinforcement learning, multi-agent reinforcement learning is modeled as some form of a Markov decision process (MDP). Fix a set of agents I = { 1 , . . . , N } {\displaystyle I=\{1,...,N\}} . We then define: A set S {\displaystyle S} of environment states. One set A i {\displaystyle {\mathcal {A}}_{i}} of actions for each of the agents i ∈ I = { 1 , … , N } {\displaystyle i\in I=\{1,\dots ,N\}} . P a → ( s , s ′ ) = Pr ( s t + 1 = s ′ ∣ s t = s , a → t = a → ) {\displaystyle P_{\vec {a}}(s,s')=\Pr(s_{t+1}=s'\mid s_{t}=s,{\vec {a}}_{t}={\vec {a}})} is the probability of transition (at time t {\displaystyle t} ) from state s {\displaystyle s} to state s ′ {\displaystyle s'} under joint action a → {\displaystyle {\vec {a}}} . R → a → ( s , s ′ ) {\displaystyle {\vec {R}}_{\vec {a}}(s,s')} is the immediate joint reward after the transition from s {\displaystyle s} to s ′ {\displaystyle s'} with joint action a → {\displaystyle {\vec {a}}} . In settings with perfect information, such as the games of chess and Go, the MDP would be fully observable. In settings with imperfect information, especially in real-world applications like self-driving cars, each agent would access an observation that only has part of the information about the current state. In the partially observable setting, the core model is the partially observable stochastic game in the general case, and the decentralized POMDP in the cooperative case. == Cooperation vs. competition == When multiple agents are acting in a shared environment their interests might be aligned or misaligned. MARL allows exploring all the different alignments and how they affect the agents' behavior: In pure competition settings, the agents' rewards are exactly opposite to each other, and therefore they are playing against each other. Pure cooperation settings are the other extreme, in which agents get the exact same rewards, and therefore they are playing with each other. Mixed-sum settings cover all the games that combine elements of both cooperation and competition. === Pure competition settings === When two agents are playing a zero-sum game, they are in pure competition with each other. Many traditional games such as chess and Go fall under this category, as do two-player variants of video games like StarCraft. Because each agent can only win at the expense of the other agent, many complexities are stripped away. There is no prospect of communication or social dilemmas, as neither agent is incentivized to take actions that benefit its opponent. The Deep Blue and AlphaGo projects demonstrate how to optimize the performance of agents in pure competition settings. One complexity that is not stripped away in pure competition settings is autocurricula. As the agents' policy is improved using self-play, multiple layers of learning may occur. === Pure cooperation settings === MARL is used to explore how separate agents with identical interests can communicate and work together. Pure cooperation settings are explored in recreational cooperative games such as Overcooked, as well as real-world scenarios in robotics. In pure cooperation settings all the agents get identical rewards, which means that social dilemmas do not occur. In pure cooperation settings, oftentimes there are an arbitrary number of coordination strategies, and agents converge to specific "conventions" when coordinating with each other. The notion of conventions has been studied in language and also alluded to in more general multi-agent collaborative tasks. === Mixed-sum settings === Most real-world scenarios involving multiple agents have elements of both cooperation and competition. For example, when multiple self-driving cars are planning their respective paths, each of them has interests that are diverging but not exclusive: Each car is minimizing the amount of time it's taking to reach its destination, but all cars have the shared interest of avoiding a traffic collision. Zero-sum settings with three or more agents often exhibit similar properties to mixed-sum settings, since each pair of agents might have a non-zero utility sum between them. Mixed-sum settings can be explored using classic matrix games such as prisoner's dilemma, more complex sequential social dilemmas, and recreational games such as Among Us, Diplomacy and StarCraft II. Mixed-sum settings can give rise to communication and social dilemmas. == Social dilemmas == As in game theory, much of the research in MARL revolves around social dilemmas, such as prisoner's dilemma, chicken and stag hunt. While game theory research might focus on Nash equilibria and what an ideal policy for an agent would be, MARL research focuses on how the agents would learn these ideal policies using a trial-and-error process. The reinforcement learning algorithms that are used to train the agents are maximizing the agent's own reward; the conflict between the needs of the agents and the needs of the group is a subject of active research. Various techniques have been explored in order to induce cooperation in agents: Modifying the environment rules, adding intrinsic rewards, and more. === Sequential social dilemmas === Social dilemmas like prisoner's dilemma, chicken and stag hunt are "matrix games". Each agent takes only one action from a choice of two possible actions, and a simple 2x2 matrix is used to describe the reward that each agent will get, given the actions that each agent took. In humans and other living creatures, social dilemmas tend to be more complex. Agents take multiple actions over time, and the distinction between cooperating and defecting is not as clear cut as in matrix games. The concept of a sequential social dilemma (SSD) was introduced in 2017 as an attempt to model that complexity. There is ongoing research into defining different kinds of SSDs and showing cooperative behavior in the agents that act in them. == Autocurricula == An autocurriculum (plural: autocurricula) is a reinforcement learning concept that's salient in multi-agent experiments. As agents improve their performance, they change their environment; this change in the environment affects themselves and the other agents. The feedback loop results in several distinct phases of learning, each depending on the previous one. The stacked layers of learning are called an autocurriculum. Autocurricula are especially apparent in adversarial settings, where each group of agents is racing to counter the current strategy of the opposing group. The Hide and Seek game is an accessible example of an autocurriculum occurring in an adversarial setting. In this experiment, a team of seekers is competing against a team of hiders. Whenever one of the teams learns a new strategy, the opposing team adapts its strategy to give the best possible counter. When the hiders learn to use boxes to build a shelter, the seekers respond by learning to use a ramp to break into that shelter. The hiders respond by locking the ramps, making them unavailable for the seekers to use. The seekers then respond by "box surfing", exploiting a glitch in the game to penetrate the shelter. Each "level" of learning is an emergent phenomenon, with the previous level as its premise. This results in a stack of behaviors, each dependent on its predecessor. Autocurricula in reinforcement learning experiments are compared to the stages of the evolution of life on Earth and the development of human culture. A major stage in evolution happened 2-3 billion years ago, when photosynthesizing life forms started to produce massive amounts of oxygen, changing the balance of gases in the atmosphere. In the next stages of evolution, oxygen-breathing life forms evolved, eventually leading up to land mammals and human beings. These later stages could only happen after the photosynthesis stage made oxygen widely available. Similarly, human culture could not have gone through the Industrial Revolution in the 18th century without the resources and insights gaine

    Read more →
  • Text normalization

    Text normalization

    Text normalization is the process of transforming text into a single canonical form that it might not have had before. Normalizing text before storing or processing it allows for separation of concerns, since input is guaranteed to be consistent before operations are performed on it. Text normalization requires being aware of what type of text is to be normalized and how it is to be processed afterwards; there is no all-purpose normalization procedure. == Applications == Text normalization is frequently used when converting text to speech. Numbers, dates, acronyms, and abbreviations are non-standard "words" that need to be pronounced differently depending on context. For example: "$200" would be pronounced as "two hundred dollars" in English, but as "lua selau tālā" in Samoan. "vi" could be pronounced as "vie," "vee," or "the sixth" depending on the surrounding words. Text can also be normalized for storing and searching in a database. For instance, if a search for "resume" is to match the word "résumé," then the text would be normalized by removing diacritical marks; and if "john" is to match "John", the text would be converted to a single case. To prepare text for searching, it might also be stemmed (e.g. converting "flew" and "flying" both into "fly"), canonicalized (e.g. consistently using American or British English spelling), or have stop words removed. == Techniques == For simple, context-independent normalization, such as removing non-alphanumeric characters or diacritical marks, regular expressions would suffice. For example, the sed script sed ‑e "s/\s+/ /g" inputfile would normalize runs of whitespace characters into a single space. More complex normalization requires correspondingly complicated algorithms, including domain knowledge of the language and vocabulary being normalized. Among other approaches, text normalization has been modeled as a problem of tokenizing and tagging streams of text and as a special case of machine translation. == Textual scholarship == In the field of textual scholarship and the editing of historic texts, the term "normalization" implies a degree of modernization and standardization – for example in the extension of scribal abbreviations and the transliteration of the archaic glyphs typically found in manuscript and early printed sources. A normalized edition is therefore distinguished from a diplomatic edition (or semi-diplomatic edition), in which some attempt is made to preserve these features. The aim is to strike an appropriate balance between, on the one hand, rigorous fidelity to the source text (including, for example, the preservation of enigmatic and ambiguous elements); and, on the other, producing a new text that will be comprehensible and accessible to the modern reader. The extent of normalization is therefore at the discretion of the editor, and will vary. Some editors, for example, choose to modernize archaic spellings and punctuation, but others do not. An edition of a text might be normalized based on internal criteria, where orthography is standardized according to the language of the original, or external criteria, where the norms of a different time period are applied. For an example of the latter, a published edition of a medieval Icelandic manuscript might be normalized to the conventions of modern Icelandic, or it might be normalized to Classical Old Icelandic. Standards of normalization vary based on language of the edition as well as the specific conventions of the publisher.

    Read more →