AI Art Can't Be Copyrighted

AI Art Can't Be Copyrighted — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

LakeFS

lakeFS is an open-source data version control system for managing data stored in object storage. It provides Git-like operations such as branching, committing, merging, and reverting for large-scale data stored in systems including Amazon S3, Azure Blob Storage, and Google Cloud Storage, as well as other S3-compatible object storage platforms. lakeFS is used in data engineering and machine learning workflows to manage changes to data, support reproducibility, and enable data governance across data lakes. The software is available as an open-source project, as well as in enterprise and managed service offerings, including lakeFS Cloud. == History == lakeFS was created in 2020 by Einat Orr and Oz Katz at Treeverse. Its first public release, version 0.8.1, appeared in August 2020 and introduced Git-style operations with support for Amazon S3. In 2021, Treeverse raised $23 million in a Series A funding round led by Dell Technologies Capital, Norwest Venture Partners, and Zeev Ventures. The same year, lakeFS was included in InfoWorld’s Best of Open Source Software (Bossie) awards. In June 2022, Treeverse introduced lakeFS Cloud, a managed service providing hosted lakeFS deployments for cloud-based data lakes. Version 1.0 was released in October 2023, adding integrations with platforms such as Databricks and Apache Iceberg, as well as support for orchestration tools including Apache Airflow. Public case studies and conference materials have described usage of lakeFS by organizations such as Microsoft, Volvo, and NASA. In July 2025, Treeverse announced an additional $20 million in growth funding to support further development of lakeFS. In November 2025, Treeverse announced the acquisition of the open-source data version control project DVC. == Software == === Overview === lakeFS provides Git-like operations such as branching, committing, merging, and reverting for datasets stored in object storage. 
These operations are used to manage changes to data, test modifications in isolation, reproduce specific data states, and recover from errors or unintended updates.
=== Architecture ===
lakeFS operates as a metadata layer on top of object storage systems such as Amazon S3, Azure Blob Storage, and Google Cloud Storage. It stores repository metadata describing commits, branches, and tags, enabling versioned views of data without copying underlying objects. The system provides access through multiple interfaces, including a web user interface, command-line tools, a REST API, and software development kits. It is designed to integrate with existing data engineering and machine learning workflows, and can be deployed either in self-hosted environments or as a managed service.
=== Functions ===
lakeFS provides version control functionality for data stored in object storage–based data lakes. Core features include:
* Atomic commits and version tracking for datasets, supporting reproducibility and auditability.
* Branching and merging mechanisms that allow isolated development and testing without duplicating data.
* Configurable hooks that can validate data or trigger external processes during commit and merge operations.
* The ability to revert repositories to earlier states to recover from data errors or failed changes.
* Recording of commit history and associated metadata for lineage tracking.
* Support for managing data across multiple object storage systems, including Amazon S3, Azure Blob Storage, Google Cloud Storage, and MinIO.
* Use of fixed data versions to reproduce experiments and machine learning model training.
=== Integrations ===
Coverage of lakeFS has described integrations with platforms such as Databricks and Apache Iceberg, as well as support for environments including Red Hat OpenShift.
Additional materials describe its use with Trino, including validation of data changes prior to merging in versioned data workflows, as well as compatibility with orchestration tools such as Apache Airflow.
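The idea of a metadata layer that offers branches and commits without copying the underlying objects can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `ToyRepo`, `Commit`, and every method name are hypothetical and are not the lakeFS API; merging here simply lets the source branch win, and `revert` just moves the branch pointer back to the parent commit.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot: maps logical paths to object addresses."""
    message: str
    parent: Optional["Commit"]
    tree: Tuple[Tuple[str, str], ...]  # sorted (path, address) pairs


class ToyRepo:
    """Hypothetical stand-in for a versioning layer over an object store."""

    def __init__(self) -> None:
        self.objects = {}  # address -> bytes (the "object storage")
        self.branches = {"main": Commit("initial commit", None, ())}
        self.staged = {"main": {}}  # uncommitted path -> address per branch

    def create_branch(self, name: str, source: str) -> None:
        # A branch is only a pointer to a commit; no object is copied.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def put(self, branch: str, path: str, data: bytes) -> None:
        address = hashlib.sha256(data).hexdigest()
        self.objects[address] = data
        self.staged[branch][path] = address

    def _advance(self, branch: str, message: str, tree: dict) -> Commit:
        head = Commit(message, self.branches[branch], tuple(sorted(tree.items())))
        self.branches[branch] = head
        return head

    def commit(self, branch: str, message: str) -> Commit:
        tree = dict(self.branches[branch].tree)
        tree.update(self.staged[branch])
        self.staged[branch] = {}
        return self._advance(branch, message, tree)

    def merge(self, source: str, dest: str) -> Commit:
        # Simplification: entries from the source branch overwrite the destination.
        tree = dict(self.branches[dest].tree)
        tree.update(dict(self.branches[source].tree))
        return self._advance(dest, f"merge {source} into {dest}", tree)

    def revert(self, branch: str) -> None:
        # Simplification: step the branch pointer back one commit.
        parent = self.branches[branch].parent
        if parent is None:
            raise ValueError("nothing to revert")
        self.branches[branch] = parent

    def read(self, branch: str, path: str) -> bytes:
        return self.objects[dict(self.branches[branch].tree)[path]]


repo = ToyRepo()
repo.put("main", "sales/2024.csv", b"a,b\n1,2\n")
repo.commit("main", "add sales data")

repo.create_branch("experiment", "main")
repo.put("experiment", "sales/2024.csv", b"a,b\n1,3\n")
repo.commit("experiment", "correct a value")

main_before_merge = repo.read("main", "sales/2024.csv")
experiment_view = repo.read("experiment", "sales/2024.csv")
objects_stored = len(repo.objects)

repo.merge("experiment", "main")
main_after_merge = repo.read("main", "sales/2024.csv")

repo.revert("main")
main_after_revert = repo.read("main", "sales/2024.csv")
```

In this sketch the change on `experiment` stays invisible on `main` until the merge, and only two objects are ever stored, because each branch is a pointer into shared, immutable data.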
Top 10 AI Website Builders Compared (2026)

Curious about the best AI website builder? An AI website builder is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI website builder slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.
FrameNet

FrameNet is a group of online lexical databases based upon the theory of meaning known as Frame semantics, developed by linguist Charles J. Fillmore. The project's fundamental notion is simple: most words' meanings may be best understood in terms of a semantic frame, which is a description of a certain kind of event, connection, or item and its actors. As an illustration, the act of cooking usually requires the following: a cook, the food being cooked, a container to hold the food while it is being cooked, and a heating instrument. Within FrameNet, this act is represented by a frame named Apply_heat, and its components (Cook, Food, Container, and Heating_instrument), are referred to as frame elements (FEs). The Apply_heat frame also lists a number of words that represent it, known as lexical units (LUs), like fry, bake, boil, and broil. Other frames are simpler. For example, Placing only has an agent or cause, a theme—something that is placed—and the location where it is placed. Some frames are more complex, like Revenge, which contains more FEs (offender, injury, injured party, avenger, and punishment). As in the examples of Apply_heat and Revenge below, FrameNet's role is to define the frames and annotate sentences to demonstrate how the FEs fit syntactically around the word that elicits the frame. == Concepts == === Frames === A frame is a schematic representation of a situation involving various participants, props, and other conceptual roles. Examples of frame names are Being_born and Locative_relation. A frame in FrameNet contains a textual description of what it represents (a frame definition), associated frame elements, lexical units, example sentences, and frame-to-frame relations. === Frame elements === Frame elements (FE) provide additional information to the semantic structure of a sentence. Each frame has a number of core and non-core FEs which can be thought of as semantic roles. 
Core FEs are essential to the meaning of the frame, while non-core FEs are generally descriptive (such as time, place, manner, etc.). For example:
* The only core FE of the Being_born frame is called Child; its non-core FEs include Time, Place, Means, etc.
* Core FEs of the Commerce_goods-transfer frame include the Seller, Buyer, and Goods, while non-core FEs include a Place, Purpose, etc.
FrameNet includes shallow data on the syntactic roles that frame elements play in the example sentences. For example, for a sentence like "She was born about AD 460", FrameNet would mark "She" as a noun phrase referring to the Child frame element, and "about AD 460" as a noun phrase corresponding to the Time frame element. Details of how frame elements can be realized in a sentence are important because they reveal information about the subcategorization frames as well as possible diathesis alternations (e.g. "John broke the window" vs. "The window broke") of a verb.
=== Lexical units ===
Lexical units (LUs) are lemmas, with their part of speech, that evoke a specific frame. In other words, when an LU is identified in a sentence, that specific LU can be associated with its specific frame(s). For each frame, there may be many LUs associated with that frame, and there may also be many frames that share a specific LU; this is typically the case with LUs that have multiple word senses. Alongside the frame, each lexical unit is associated with specific frame elements by means of the annotated example sentences. For example, lexical units that evoke the Complaining frame (or more specific perspectivized versions of it, to be precise) include the verbs complain, grouse, lament, and others.
=== Example sentences ===
Frames are associated with example sentences, and frame elements are marked within the sentences. Thus, the sentence "She was born about AD 460" is associated with the frame Being_born, while "She" is marked as the frame element Child and "about AD 460" is marked as Time.
From the start, the FrameNet project has been committed to looking at evidence from actual language use as found in text collections like the British National Corpus. Based on such example sentences, automatic semantic role labeling tools are able to determine frames and mark frame elements in new sentences.
=== Valences ===
FrameNet also exposes statistics on the valence of each frame; that is, the number and position of the frame elements within example sentences. The sentence "She was born about AD 460" falls in the valence pattern NP Ext, INI --, NP Dep, which occurs twice in FrameNet's annotation report for the born.v lexical unit, namely:
* She was born about AD 460, daughter and granddaughter of Roman and Byzantine emperors, whose family had been prominent in Roman politics for over 700 years.
* He was soon posted to north Africa, and never met their only child, a daughter born 8 June 1941.
=== Frame relations ===
FrameNet additionally captures relationships between different frames using relations. These include the following:
* Inheritance: When one frame is a more specific version of another, more abstract, parent frame. Anything that is true about the parent frame must also be true about the child frame, and a mapping is specified between the frame elements of the parent and the frame elements of the child.
* Perspectivization: A neutral frame is connected to a frame with a specific perspective of the same scenario. For example, Commerce_goods-transfer is considered from the perspective of the buyer in Commerce_buy and from that of the seller in Commerce_sell.
* Subframe: Some frames refer to complex scenarios that consist of several individual states or events that can be described by separate frames. For example, Criminal_process is composed of Arrest, Trial, and so on.
* Precedence: This relation captures the temporal order that holds between subframes of a complex frame. For example, within the Cycle_of_life_and_death frame, the subframe Death is preceded by the subframe Being_born.
* Causative and Inchoative: These two relations mark, for causative- and inchoative-aspect frames, the separate stative frame they refer to. For example, the stative Position_on_a_scale (e.g. "She had a high salary") is described by the causative Cause_change_of_scalar_position (e.g. "She raised his salary") and by the inchoative Change_position_on_a_scale frame (e.g. "Her salary increased").
* Using: This relation marks a frame that in some way involves another frame. For example, Judgment_communication uses both Judgment and Statement, but does not inherit from either of them because there is no clear correspondence of frame elements.
* See also: Connects frames that bear some resemblance but need to be distinguished carefully.
== Applications ==
FrameNet has proven to be useful in a number of computational applications, because computers need additional knowledge in order to recognize that "John sold a car to Mary" and "Mary bought a car from John" describe essentially the same situation, despite using two quite different verbs, different prepositions and a different word order. FrameNet has been used in applications like question answering, paraphrasing, recognizing textual entailment, and information extraction, either directly or by means of Semantic Role Labeling tools. The first automatic system for Semantic Role Labeling (SRL, sometimes also referred to as "shallow semantic parsing") was developed by Daniel Gildea and Daniel Jurafsky based on FrameNet in 2002. Semantic Role Labeling has since become one of the standard tasks in natural language processing, with the latest version (1.7) of FrameNet now fully supported in the Natural Language Toolkit.
Since frames are essentially semantic descriptions, they are similar across languages, and several projects have arisen over the years that have relied on the original FrameNet as the basis for additional non-English FrameNets, for Spanish, Japanese, German, and Polish, among others.
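The relationship between frames, frame elements, and lexical units, and the way perspectivized frames let "John sold a car to Mary" and "Mary bought a car from John" be recognized as the same situation, can be sketched with a small hand-built data structure. This is a minimal illustration, not the FrameNet database or the NLTK interface: the `Frame` class, the helper names, the FE lists for Commerce_buy and Commerce_sell, and the lexical units buy.v and sell.v are assumptions made for the example, and the role annotations are supplied by hand rather than produced by a parser.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame:
    name: str
    frame_elements: Tuple[str, ...]
    lexical_units: Tuple[str, ...] = ()  # "lemma.pos" strings
    perspective_on: str = ""             # neutral frame, if this is a perspectivized frame


# Hand-built miniature inventory (simplified for illustration).
FRAMES: Dict[str, Frame] = {
    frame.name: frame
    for frame in (
        Frame(
            "Apply_heat",
            ("Cook", "Food", "Container", "Heating_instrument"),
            ("fry.v", "bake.v", "boil.v", "broil.v"),
        ),
        Frame("Commerce_goods-transfer", ("Seller", "Buyer", "Goods")),
        Frame("Commerce_buy", ("Buyer", "Seller", "Goods"), ("buy.v",),
              perspective_on="Commerce_goods-transfer"),
        Frame("Commerce_sell", ("Seller", "Buyer", "Goods"), ("sell.v",),
              perspective_on="Commerce_goods-transfer"),
    )
}


def frames_evoked_by(lexical_unit: str) -> List[str]:
    """Return the names of all frames that list this lexical unit."""
    return sorted(f.name for f in FRAMES.values() if lexical_unit in f.lexical_units)


def neutral_reading(frame_name: str, roles: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Map an annotated sentence onto the neutral frame behind its perspective."""
    frame = FRAMES[frame_name]
    unknown = set(roles) - set(frame.frame_elements)
    if unknown:
        raise ValueError(f"{frame_name} has no frame elements {sorted(unknown)}")
    return (frame.perspective_on or frame.name), dict(roles)


# "John sold a car to Mary"
sold = neutral_reading("Commerce_sell", {"Seller": "John", "Goods": "a car", "Buyer": "Mary"})
# "Mary bought a car from John"
bought = neutral_reading("Commerce_buy", {"Buyer": "Mary", "Goods": "a car", "Seller": "John"})
same_situation = sold == bought
```

Once both sentences are reduced to the neutral frame with the same role fillers, the comparison no longer depends on the verb, the prepositions, or the word order.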
DeepL Translator

DeepL is a German AI research company known for its language AI platform, which includes DeepL Translator and DeepL Voice, and for DeepL Agent, an AI agent capable of planning workflows and using office systems and tools autonomously in response to natural language instructions. Its algorithm uses the transformer architecture. It offers a paid subscription for additional features and access to its translation application programming interface. DeepL was founded in 2017 by Jarosław Kutyłowski and is a unicorn, valued at $2 billion after a Series C funding round raised $300 million in May 2024. Its more than 200,000 business customers include a large proportion of the Fortune 500.
== History ==
The translating system was first developed within Linguee by a team led by Chief Technology Officer Jarosław Kutyłowski in 2016. It was launched as DeepL Translator on 28 August 2017 and offered translations between English, German, French, Spanish, Italian, Polish and Dutch. At its launch, it claimed to have surpassed its competitors, including Google Translate, Amazon Translate, Microsoft Translator and Facebook's translation feature, in blind tests and BLEU scores. With the release of DeepL in 2017, Linguee's company name was changed to DeepL GmbH; the company is also financed by advertising on its sister site, linguee.com. Support for Portuguese and Russian was added on 5 December 2018. In July 2019, Jarosław Kutyłowski became the CEO of DeepL GmbH, and in 2021 he restructured the company into a Societas Europaea. Translation software for Microsoft Windows and macOS was released in September 2019. Support for Chinese (simplified) and Japanese was added on 19 March 2020, with the company claiming that its translations in these languages surpassed those of the aforementioned competitors as well as Baidu and Youdao.
Then, 13 more European languages were added in March 2021: Bulgarian, Czech, Danish, Estonian, Finnish, Greek, Hungarian, Latvian, Lithuanian, Romanian, Slovak, Slovenian, and Swedish, bringing the total number of supported languages to 24. On 25 May 2022, support for Indonesian and Turkish was added, and support for Ukrainian was added on 14 September 2022. In January 2023, the company reached a valuation of 1 billion euro and became the most valued startup company in Cologne. At the end of the month, support for Korean and Norwegian (Bokmål) was also added. In May 2024, the company announced an investment of US$300 million at a valuation of US$2 billion. In January 2026, support was added for more languages, including Luxembourgish and Irish.
== Services ==
=== Translation method ===
The service uses a proprietary algorithm with convolutional neural networks (CNNs) that have been trained with the Linguee database. According to the developers, the service uses a newer, improved neural network architecture, resulting in translations that sound more natural than those of competing services. The translation is generated using a supercomputer that reaches 5.1 petaflops and is operated in Iceland with hydropower. DeepL's data centers are located at the EcoDataCenter in Falun, Sweden, a data center focused on sustainability. In general, CNNs are slightly more suitable for long coherent word sequences, but they have so far not been used by the competition because of their weaknesses compared to recurrent neural networks. The weaknesses of DeepL are compensated for by supplemental techniques, some of which are publicly known.
=== Translator and subscription ===
The translator can be used for free with a limit of 1,500 characters per translation. Microsoft Word and PowerPoint files in Office Open XML file formats (.docx and .pptx) and PDF files up to 5 MB in size can also be translated.
It offers a paid subscription, DeepL Pro, which has been available since March 2018 and includes application programming interface access and a software plug-in for computer-assisted translation tools, including SDL Trados Studio. Unlike in the free version, translated texts are stated not to be saved on the server, and the character limit is removed. The monthly pricing model includes a set amount of text, with texts beyond that charged according to the number of characters.
==== Supported languages ====
As of May 2026, the translation service supports the following languages: Additionally, these languages are currently in beta, indicated by an asterisk after their name in the language picker:
=== DeepL Write ===
In November 2022, DeepL launched a tool to improve monolingual texts in English and German, called DeepL Write. In December, the company removed access and informed journalists that it was only for internal use and that DeepL Write would be relaunched in early 2023. The public beta version was then released on January 17, 2023. In the summer of 2024, DeepL announced the availability of two more languages in DeepL Write: French and Spanish. By January 2024, DeepL had added an additional two: Portuguese (European and Brazilian) and Italian.
=== DeepL Agent ===
In November 2025, DeepL launched an AI agent called DeepL Agent, which is capable of operating business applications in a human-like manner.
== Reception ==
The reception of DeepL has been generally positive. TechCrunch praised the accuracy of its translations, stating that they were more accurate and nuanced than those of Google Translate. Le Monde credited its developers with translating French text into more "French-sounding" expressions. RTL Z stated that DeepL Translator "offers better translations […] when it comes to Dutch to English and vice versa". La Repubblica and a Latin American website, "WWWhat's new?", also praised it.
A 2018 paper by the University of Bologna evaluated the Italian-to-German translation capabilities and found the preliminary results to be similar in quality to Google Translate. In September 2021, Slator remarked that the language industry response was more measured than the press and noted that DeepL is still highly regarded by users. A reviewer noted in 2018 that DeepL had far fewer languages available for translation than competing products. == Awards and honors == DeepL won the 2020 Webby Award for Best Practices and the 2020 Webby Award for Technical Achievement (Apps, Mobile, and Features), both in the category Apps, Mobile & Voice. In April 2025, DeepL was featured in the Forbes AI 50 list.
Serverless computing

Serverless computing is "a cloud service category where the customer can use different cloud capability types without the customer having to provision, deploy and manage either hardware or software resources, other than providing customer application code or providing customer data. Serverless computing represents a form of virtualized computing", according to ISO/IEC 22123-2. Serverless computing is a broad ecosystem that includes the cloud provider, function as a service (FaaS), managed services, tools, frameworks, engineers, stakeholders, and other interconnected elements. == Overview == Serverless is a misnomer in the sense that servers are still used by cloud service providers to execute code for developers. The definition of serverless computing has evolved over time, leading to varied interpretations. According to Ben Kehoe, serverless represents a spectrum rather than a rigid definition. Emphasis should shift from strict definitions and specific technologies to adopting a serverless mindset, focusing on leveraging serverless solutions to address business challenges. Serverless computing does not eliminate complexity but shifts much of it from the operations team to the development team. However, this shift is not absolute, as operations teams continue to manage aspects such as identity and access management (IAM), networking, security policies, and cost optimization. Additionally, while breaking down applications into finer-grained components can increase management complexity, the relationship between granularity and management difficulty is not strictly linear. There is often an optimal level of modularization where the benefits outweigh the added management overhead. According to Yan Cui, serverless techniques should be adopted only when they help to deliver customer value faster. And while adopting, organizations should take small steps and de-risk along the way. == Challenges == Serverless applications are prone to fallacies of distributed computing. 
In addition, they are prone to the following fallacies:
* Versioning is simple
* Compensating transactions always work
* Observability is optional
=== Monitoring and debugging ===
Monitoring and debugging serverless applications can present unique challenges due to their distributed, event-driven nature and proprietary environments. Traditional tools may fall short, making it difficult to track execution flows across services. However, modern solutions such as distributed tracing tools (e.g., AWS X-Ray, Datadog), centralized logging, and cloud-agnostic observability platforms are mitigating these challenges. Emerging technologies like OpenTelemetry, AI-powered anomaly detection, and serverless-specific frameworks are further improving visibility and root cause analysis. While challenges persist, advancements in monitoring and debugging tools are steadily addressing these limitations.
=== Security ===
According to OWASP, serverless applications are vulnerable to variations of traditional attacks, insecure code, and some serverless-specific attacks (like denial of wallet). So, the risks have changed and attack prevention requires a shift in mindset.
=== Vendor lock-in ===
Serverless computing is provided as a third-party service. Applications and software that run in the serverless environment are by default locked to a specific cloud vendor. This issue is exacerbated in serverless computing, as with its increased level of abstraction, public vendors only allow customers to upload code to a FaaS platform without the authority to configure underlying environments. More importantly, when considering a more complex workflow that includes backend-as-a-service (BaaS), a BaaS offering can typically only natively trigger a FaaS offering from the same provider. This makes the workload migration in serverless computing virtually impossible. Therefore, considering how to design and deploy serverless workflows from a multi-cloud perspective could mitigate this.
== High-performance computing == Serverless computing may not be ideal for certain high-performance computing (HPC) workloads due to resource limits often imposed by cloud providers, including maximum memory, CPU, and runtime restrictions. For workloads requiring sustained or predictable resource usage, bulk-provisioned servers can sometimes be more cost-effective than the pay-per-use model typical of serverless platforms. However, serverless computing is increasingly capable of supporting specific HPC workloads, particularly those that are highly parallelizable and event-driven, by leveraging its scalability and elasticity. The suitability of serverless computing for HPC continues to evolve with advancements in cloud technologies. == Anti-patterns == The grain of sand anti-pattern refers to the creation of excessively small components (e.g., functions) within a system, often resulting in increased complexity, operational overhead, and performance inefficiencies. Lambda pinball is a related anti-pattern that can occur in serverless architectures when functions (e.g., AWS Lambda, Azure functions) excessively invoke each other in fragmented chains, leading to latency, debugging and testing challenges, and reduced observability. These anti-patterns are associated with the formation of a distributed monolith. These anti-patterns are often addressed through the application of clear domain boundaries, which distinguish between public and published interfaces. Public interfaces are technically accessible interfaces, such as methods, classes, API endpoints, or triggers, but they do not come with formal stability guarantees. In contrast, published interfaces involve an explicit stability contract, including formal versioning, thorough documentation, a defined deprecation policy, and often support for backward compatibility. 
Published interfaces may also require maintaining multiple versions simultaneously and adhering to formal deprecation processes when breaking changes are introduced. Fragmented chains of function calls are often observed in systems where serverless components (functions) interact with other resources in complex patterns, sometimes described as spaghetti architecture or a distributed monolith. In contrast, systems exhibiting clearer boundaries typically organize serverless components into cohesive groups, where internal public interfaces manage inter-component communication, and published interfaces define communication across group boundaries. This distinction highlights differences in stability guarantees and maintenance commitments, contributing to reduced dependency complexity. Additionally, patterns associated with excessive serverless function chaining are sometimes addressed through architectural strategies that emphasize native service integrations instead of individual functions, a concept referred to as the functionless mindset. However, this approach is noted to involve a steeper learning curve, and integration limitations may vary even within the same cloud vendor ecosystem. Reporting on serverless databases presents challenges, as retrieving data for a reporting service can either break the bounded contexts, reduce the timeliness of the data, or do both. This applies regardless of whether data is pulled directly from databases, retrieved via HTTP, or collected in batches. Mark Richards refers to this as the reach-in reporting anti-pattern. A possible alternative to this approach is for databases to asynchronously push the necessary data to the reporting service instead of the reporting service pulling it. While this method requires a separate contract between services and the reporting service and can be complex to implement, it helps preserve bounded contexts while maintaining a high level of data timeliness. 
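The push-based alternative to reach-in reporting can be sketched with an in-memory queue standing in for a managed message service. This is a minimal illustration under stated assumptions: `OrderService`, `ReportingService`, the `OrderPlaced` event, and `drain` are hypothetical names, and a real deployment would deliver events asynchronously rather than draining a local queue.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """The separate contract between the order service and reporting."""
    order_id: str
    amount_cents: int


class OrderService:
    """Owns its data; nothing outside this class reads `_db` directly."""

    def __init__(self, outbox: "queue.Queue[OrderPlaced]") -> None:
        self._db = {}
        self._outbox = outbox

    def place_order(self, order_id: str, amount_cents: int) -> None:
        self._db[order_id] = amount_cents
        # Push the data reporting needs instead of letting reporting pull it.
        self._outbox.put(OrderPlaced(order_id, amount_cents))


class ReportingService:
    """Builds its own read model purely from the events it receives."""

    def __init__(self) -> None:
        self.order_count = 0
        self.total_cents = 0

    def handle(self, event: OrderPlaced) -> None:
        self.order_count += 1
        self.total_cents += event.amount_cents


def drain(outbox: "queue.Queue[OrderPlaced]", reporting: ReportingService) -> None:
    """Deliver all pending events (stand-in for an asynchronous consumer)."""
    while True:
        try:
            event = outbox.get_nowait()
        except queue.Empty:
            break
        reporting.handle(event)


outbox = queue.Queue()
orders = OrderService(outbox)
reporting = ReportingService()

orders.place_order("o-1", 1250)
orders.place_order("o-2", 800)
drain(outbox, reporting)
```

The reporting service never touches the order service's storage, so the bounded context stays intact, and its figures are as fresh as the last delivered event.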
== Principles ==
Adopting DevSecOps practices can help improve the use and security of serverless technologies. In serverless applications, the distinction between infrastructure and business logic is often blurred, with applications typically distributed across multiple services. To maximize the effectiveness of testing, integration testing is emphasized for serverless applications. Additionally, to facilitate debugging and implementation, orchestration is used within the bounded context, while choreography is employed between different bounded contexts. Ephemeral resources are typically kept together to maintain high cohesion. However, shared resources with long spin-up times, such as AWS RDS clusters and landing zones, are often managed in separate repositories, deployment pipelines, and stacks.
Douwe Kiela

Douwe Kiela is a Dutch-American research scientist and entrepreneur working in the field of artificial intelligence with a focus on machine learning and natural language processing. He is a research scientist director at Google DeepMind. He previously co-founded and served as CEO of Contextual AI, an enterprise software company that provides a platform for building grounded AI agents for enterprise knowledge bases. He previously led the research team at Meta AI that introduced the RAG approach in 2020, co-authoring the foundational paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Kiela also served as Head of Research at Hugging Face and is an adjunct professor in Symbolic Systems at Stanford University. == Early life and education == Douwe Kiela was born in Amsterdam, Netherlands, in 1986. He earned a Bachelor of Science degree in Liberal Arts and Sciences from Utrecht University, with a double major in Cognitive Artificial Intelligence and Philosophy. He then obtained an MSc in logic (cum laude) from the University of Amsterdam's Institute for Logic, Language and Computation (ILLC). Kiela received an MPhil and PhD in Computer Science from the University of Cambridge, specializing in natural language processing and machine learning. == Career == === Facebook AI Research (Meta) === In 2016, Kiela joined Facebook AI Research (FAIR) as a postdoctoral researcher, later becoming a research scientist in New York. While at Meta, he co-authored papers in natural language processing, with a focus on multimodal and grounded language learning. His projects included creating a virtual assistant bot that could navigate tourists around a city and leading the development of Dynabench, an interactive benchmarking platform released in 2020 that used human feedback to test and improve language models. 
In 2020, Kiela led the Meta AI research team that introduced Retrieval-Augmented Generation (RAG), co-authoring the influential paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," alongside Patrick Lewis, Ethan Perez, and other researchers. The RAG framework transformed how large language models access and incorporate external information by allowing them to retrieve relevant context from external knowledge bases at query time, rather than relying solely on pre-trained data. This approach addressed key limitations such as hallucination, outdated information, and lack of source attribution. The RAG technique has since become widely adopted in enterprise AI applications and knowledge-intensive natural language processing tasks. === Hugging Face === After leaving Meta, Kiela served as Head of Research at Hugging Face. === Contextual AI === In 2023, Kiela co-founded Contextual AI with Amanpreet Singh, another former researcher at Facebook AI Research and Hugging Face. The Mountain View-based company develops a platform for building grounded AI agents for enterprises, focusing on applications in technology, semiconductor, logistics, finance, and media sectors. Contextual AI raised $20 million in seed funding in June 2023, led by Bain Capital Ventures. In August 2024, the company completed an $80 million Series A funding round led by Greycroft, with participation from Bezos Expeditions, NVentures (Nvidia), HSBC Ventures, and Snowflake Ventures, among others. In May 2026, Kiela joined Google DeepMind as part of a licensing agreement between Google and Contextual AI under which more than 20 Contextual AI researchers joined DeepMind. Following his departure, Jay Chen became interim CEO of Contextual AI. === Academic roles === Douwe Kiela serves as an adjunct professor in Symbolic Systems at Stanford University. 
In a 2023 interview with the Stanford Daily, he commented on the development of Alpaca, a low-cost instruction-finetuned model based on Meta's LLaMA, and emphasized the importance of open academic research in large language models.
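The retrieve-then-generate idea behind RAG described above (fetching relevant context from an external knowledge base at query time) can be sketched in a few lines. This is a toy illustration, not the method from the paper: the original work pairs a dense neural retriever with a sequence-to-sequence generator, whereas this sketch substitutes simple word overlap for retrieval and stops at building the prompt. All function names and the sample knowledge base are assumptions made for the example.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, document: str) -> int:
    """Count shared tokens (a crude stand-in for a learned retriever)."""
    q, d = Counter(tokenize(query)), Counter(tokenize(document))
    return sum(min(q[token], d[token]) for token in q)


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents that best match the query."""
    ranked = sorted(documents, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Assemble the text a generator model would receive."""
    passages = retrieve(query, documents, k)
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below and cite passages by number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


KNOWLEDGE_BASE = [
    "The Georgetown-IBM experiment was performed on January 7, 1954.",
    "lakeFS was created in 2020 by Einat Orr and Oz Katz.",
    "DeepL Translator was launched on 28 August 2017.",
]

question = "When was the Georgetown-IBM experiment performed?"
top_passage = retrieve(question, KNOWLEDGE_BASE, k=1)[0]
prompt = build_prompt(question, KNOWLEDGE_BASE, k=1)
```

Because the passages are looked up at query time, updating the knowledge base changes the answers without retraining, and the numbered passages give the generator something to attribute its answer to.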
Is an AI Text-to-image Tool Worth It in 2026?

Trying to pick the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.
Georgetown–IBM experiment

The Georgetown–IBM experiment was an influential demonstration of machine translation, which was performed on January 7, 1954. Developed jointly by Georgetown University and IBM, the experiment involved completely automatic translation of more than sixty Russian sentences into English. == Background == Conceived and performed primarily in order to attract governmental and public interest and funding by showing the possibilities of machine translation, it was by no means a fully featured system: it had only six grammar rules and 250 lexical items in its vocabulary (of stems and endings). Words in the vocabulary were in the fields of politics, law, mathematics, chemistry, metallurgy, communications and military affairs. Vocabulary was punched onto punch cards. This complete dictionary was never fully shown (only the extended one from Garvin's article). Apart from general topics, the system was specialized in the domain of organic chemistry. The translation was carried out using an IBM 701 mainframe computer (launched in April 1953). The Georgetown–IBM experiment is the best-known result of the MIT conference in June 1952 to which all active researchers in the machine translation field were invited. At the conference, Duncan Harkin from the US Department of Defense suggested that his department would finance a new machine translation project. Jerome Wiesner supported the idea and offered funding from the Research Laboratory of Electronics at MIT. Leon Dostert had been invited to the project for his previous experience with the automatic correction of translations (back then 'mechanical translation'); his interpretation system had a strong impact on the Nuremberg War Crimes Tribunal. The linguistics part of the demonstration was carried out for the most part by linguist Paul Garvin, who also had a good knowledge of Russian. 
Over 60 Romanized Russian statements from a wide range of political, legal, mathematical, and scientific topics were entered into the machine by a computer operator who knew no Russian, and the resulting English translations appeared on a printer. The sentences to be translated were carefully selected. Many operations for the demonstration were fitted to specific words and sentences. In addition, there was no relational or sentence analysis which could recognize the sentence structure. The approach was mostly 'lexicographical', based on a dictionary where a specific word had a connection with specific rules and steps. == Algorithm == The algorithm first translates Russian words into numerical codes, then performs the following case-analysis on each numerical code to choose between possible English word translations, reorder the English words, or omit some English words. The flowchart of the algorithm is reproduced in (see Table 1 for the 6 rules). == Translation examples == How it analyzes Vyelyichyina ugla opryedyelyayetsya otnoshyenyiyem dlyini dugi k radyiusu (figure 2 of ). == Reception == Well publicized by journalists and perceived as a success, the experiment did encourage governments to invest in computational linguistics. The authors claimed that within three or five years, machine translation could well be a solved problem. However, real progress was much slower, and after the ALPAC report in 1966, which found that ten years of research had failed to fulfill the expectations, funding was reduced dramatically. The demonstration was given widespread coverage in the foreign press, but only a small fraction of journalists drew attention to previous machine translation attempts.
Distributed concurrency control

Distributed concurrency control is the concurrency control of a system distributed over a computer network (Bernstein et al. 1987, Weikum and Vossen 2001). In database systems and transaction processing (transaction management), distributed concurrency control refers primarily to the concurrency control of a distributed database. It also refers to concurrency control in a multidatabase (and other multi-transactional object) environment (e.g., federated database, grid computing, and cloud computing environments). A major goal of distributed concurrency control is distributed serializability (or global serializability for multidatabase systems). Distributed concurrency control poses special challenges beyond those of centralized concurrency control, primarily due to communication and computer latency. It often requires special techniques, such as a distributed lock manager running over fast, low-latency computer networks such as switched fabric (e.g., InfiniBand). The most common distributed concurrency control technique is strong strict two-phase locking (SS2PL, also named rigorousness), which is also a common centralized concurrency control technique. SS2PL provides both serializability and strictness. Strictness, a special case of recoverability, is utilized for effective recovery from failure. For large-scale distribution and complex transactions, the typical heavy performance penalty of distributed locking (due to delays and latency) can be avoided by using the atomic commitment protocol, which is needed in a distributed database for the atomicity of (distributed) transactions.
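The locking discipline behind SS2PL can be illustrated with a small single-process sketch in Python. It is not taken from the cited sources: `LockManager` and `Transaction` are hypothetical names, only exclusive locks are modelled, and a lock timeout stands in for real deadlock handling. What it is meant to show is the defining property of SS2PL: a transaction acquires locks as it touches data and releases all of them only when it commits or aborts.

```python
import threading
from collections import defaultdict


class LockManager:
    """Toy lock table with exclusive locks only (hypothetical helper)."""

    def __init__(self):
        self._owner = {}               # data item -> id of the transaction holding it
        self._held = defaultdict(set)  # transaction id -> items it has locked
        self._cond = threading.Condition()

    def acquire(self, txn_id, item, timeout=1.0):
        """Wait until `item` is free or already ours; return False on timeout."""
        with self._cond:
            free = self._cond.wait_for(
                lambda: self._owner.get(item) in (None, txn_id), timeout
            )
            if free:
                self._owner[item] = txn_id
                self._held[txn_id].add(item)
            return free

    def release_all(self, txn_id):
        """Release every lock of a transaction in one step, at its end."""
        with self._cond:
            for item in self._held.pop(txn_id, set()):
                del self._owner[item]
            self._cond.notify_all()


class Transaction:
    """Buffers writes and holds all of its locks until commit or abort."""

    def __init__(self, txn_id, locks, store):
        self.txn_id, self.locks, self.store = txn_id, locks, store
        self.writes = {}  # buffered updates, applied only on commit

    def _lock(self, key):
        # Assumption: a timeout is treated as a possible deadlock, so abort.
        if not self.locks.acquire(self.txn_id, key):
            self.abort()
            raise TimeoutError(f"{self.txn_id} could not lock {key!r}")

    def read(self, key):
        self._lock(key)
        return self.writes.get(key, self.store.get(key))

    def write(self, key, value):
        self._lock(key)
        self.writes[key] = value

    def commit(self):
        self.store.update(self.writes)
        self.locks.release_all(self.txn_id)  # locks are released only here...

    def abort(self):
        self.writes.clear()
        self.locks.release_all(self.txn_id)  # ...or here, never earlier
```

In a distributed database each site would presumably run its own lock table, and the `release_all` step would follow the outcome of the atomic commitment protocol; that coordination is left out of this sketch.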
Localization Industry Standards Association

Localization Industry Standards Association or LISA was a Swiss-based trade body concerning the translation of computer software (and associated materials) into multiple natural languages, which existed from 1990 to February 2011. It counted among its members most of the large information technology companies of the period, including Adobe, Cisco, Hewlett-Packard, IBM, McAfee, Nokia, Novell and Xerox. LISA played a significant role in representing its partners at the International Organization for Standardization (ISO), and the TermBase eXchange (TBX) standard developed by LISA was submitted to ISO in 2007 and became ISO 30042:2008. LISA also had a presence at the W3C. A number of the LISA standards are used by the OASIS Open Architecture for XML Authoring and Localization framework. LISA shut down on 28 February 2011, and its website went offline shortly afterwards. In the wake of the closure of LISA, the European Telecommunications Standards Institute started an Industry Specification Group (ISG) for localization. The ISG has five work items, including: Term-Base eXchange (TBX) / ISO 30042:2008; Translation Memory eXchange (TMX), with GALA; Segmentation Rules eXchange (SRX) / ISO/CD 24621; and Global information management Metrics eXchange – Volume (GMX-V). Another organization that was formed in response to the closure of LISA is Terminology for Large Organizations (TerminOrgs), a consortium of terminology professionals who promote terminology management best practices.
Nicolò Cesa-Bianchi

Nicolò Cesa-Bianchi (Italian pronunciation: [nikoˈlɔ tˈtʃɛːza ˈbjaŋki]) is an Italian computer scientist and Professor of Computer Science at the Department of Computer Science of the University of Milan. He is a researcher in the field of machine learning, and co-author of the books "Prediction, Learning, and Games" with Gabor Lugosi and "Regret analysis of stochastic and nonstochastic multi-armed bandit problems" with Sébastien Bubeck. == Education and career == Cesa-Bianchi graduated in Computer Science from the University of Milan in 1988, where he received a PhD in Computer Science in 1993, supervised by Alberto Bertoni. During his PhD, he visited UC Santa Cruz, where he worked with Manfred Warmuth and David Haussler. He did his postdoctoral studies at Graz University of Technology under the supervision of Wolfgang Maass. == Research == His research contributions focus on the following areas: the design and analysis of machine learning algorithms, especially online machine learning algorithms for multi-armed bandit problems, with applications to recommender systems and online auctions; and graph analytics, with applications to social networks and bioinformatics. == Awards and honors == Cesa-Bianchi received a Google Research Award in 2010, a Xerox University Affairs Committee Award in 2011, a Criteo Faculty Award in 2017, a Google Faculty Award in 2018, and an IBM Academic Award in 2021. Since 2023 he has been a corresponding member of the Accademia dei Lincei.
Yun Sing Koh

Yun Sing Koh (born 1978) is a New Zealand computer science academic, and is a full professor at the University of Auckland, specialising in machine learning and artificial intelligence. She is a co-director of the Centre of Machine Learning for Social Good, and the Advanced Machine Learning and Data Analytics Research (MARS) Lab at Auckland. == Academic career == Koh earned a Bachelor of Science with Honours and a Master of Software Engineering at the University of Malaya. She then completed a PhD titled Generating sporadic association rules at the University of Otago in 2007. Koh joined the faculty of the University of Auckland in 2010, rising to full professor. As of 2024, she is director of the Centre of Machine Learning for Social Good at Auckland, alongside Gillian Dobbie and Daniel Wilson, and is director of the Master of AI course at the university. Koh also co-directs the Advanced Machine Learning and Data Analytics Research (MARS) Lab. Koh's research covers machine learning and artificial intelligence. She is especially interested in designing machine learning algorithms for data streams, and has led research using AI systems to identify individual stoats for pest population research. In 2018 she was awarded a Marsden grant for a research project "An Adaptive Predictive System for Life-long Learning on Data Streams", and has been part of three MBIE projects. In 2025 the stoat identification project Koh co-leads with Daniel Wilson was awarded $1 million per annum by the MBIE Smart Ideas fund. Koh was a finalist in the AI in Climate section of the Women in AI Australia and New Zealand Awards in 2022. She was a 2023 Fellow at the United States National Science Foundation-funded Convergence Research (CORE) Institute. Koh has chaired a number of sessions at international conferences on data mining. 
In March 2026 it was announced that Koh would be a member of the New Zealand Human Rights Commission's Expert Advisory Group on Artificial Intelligence, Emerging Digital Technologies and Human Rights. == Selected works == Philippe Fournier-Viger; Jerry Chun-Wei Lin; Rage Uday Kiran; Yun Sing Koh; Rincy Thomas (2017). "A Survey of Sequential Pattern Mining". Data Science and Pattern Recognition. 1 (1): 54–77. Wikidata Q138719481. Yun Sing Koh; Nathan Rountree; Richard O’Keefe (1 April 2006). "Finding Non-Coincidental Sporadic Rules Using Apriori-Inverse". International Journal of Data Warehousing and Mining. 2 (2): 38–54. doi:10.4018/JDWM.2006040102. ISSN 1548-3924. Wikidata Q125185222. Russel Pears; Sripirakas Sakthithasan; Yun Sing Koh (11 January 2014). "Detecting concept change in dynamic data streams". Machine Learning. 97 (3): 259–293. doi:10.1007/S10994-013-5433-9. ISSN 1573-0565. Zbl 1319.68186. Wikidata Q125185156. David Tse Jung Huang; Yun Sing Koh; Gillian Dobbie; Russel Pears (December 2014), Detecting Volatility Shift in Data Streams, Institute of Electrical and Electronics Engineers, doi:10.1109/ICDM.2014.50, Wikidata Q125185151. Sidney Tsang; Yun Sing Koh; Gillian Dobbie (2011). "RP-Tree: Rare Pattern Tree Mining". Lecture Notes in Computer Science: 277–288. doi:10.1007/978-3-642-23544-3_21. ISSN 0302-9743. Wikidata Q125185206. Yun Sing Koh; Sri Devi Ravana (24 May 2016). "Unsupervised Rare Pattern Mining". ACM Transactions on Knowledge Discovery from Data. 10 (4): 1–29. doi:10.1145/2898359. ISSN 1556-4681. Wikidata Q125185136. Jack Julian; Yun Sing Koh; Albert Bifet (1 October 2025), Building adaptive knowledge bases for evolving continual learning models (PDF), vol. 1, doi:10.1038/S44387-025-00028-4, Wikidata Q138719496
Reference Software International

Reference Software International, Inc. (RSI), was an American software developer active from 1985 to 1993 and based in Albuquerque, New Mexico, and San Francisco, California. The company released several productivity and reference software packages, including the Grammatik grammar checker, for MS-DOS. The company was acquired by WordPerfect Corporation in 1993. == History == === Background (1980–1985) === Reference Software International, Inc., was founded by Donald "Don" Emery and Bruce Wampler in 1985 in San Francisco, California. Both Wampler and Emery were college professors when they founded RSI: Wampler at the University of New Mexico as a professor of computer science and Emery as a professor of marketing at San Francisco State University. After graduating from the University of Utah around 1978, Wampler founded his first software company, Aspen Software, in Tijeras, New Mexico, in 1979. Wampler founded Aspen to develop an early spell checker software package, called Proofreader, for the TRS-80, licensing Random House's Webster's Unabridged Dictionary for the package's lexicon. In 1980, he began development on a grammar checker inspired by Writer's Workbench, a pioneering grammar checker for Unix systems. Wampler used Writer's Workbench heavily during the writing of his doctoral dissertation but disliked having to jump between the Apple II on which he composed the dissertation and the mainframe on which Writer's Workbench ran, and so wanted to develop a version of the latter for microcomputers. Wampler's work came to fruition as Grammatik in 1981, which was eventually ported to several other microcomputer platforms in the early 1980s. In 1983, by which point the company had 12 employees and had sold a combined 80,000 units of Grammatik and Proofreader, Wampler sold Aspen to Dictronics, a software company best known for developing the Electronic Thesaurus, an early thesaurus program for microcomputers. 
Dictronics was in turn purchased by Wang Laboratories; according to Wampler, "Wang bought [Aspen] and sat on it. They did nothing with it". Wampler moved on to teach at the University of New Mexico, but, frustrated by Wang's inaction, got the urge to resurrect his work. In 1985, he was able to license back Grammatik and Proofreader from a small California-based software firm that had grandfathered rights to a forked version of both. In the same year, he met Emery, who, impressed by Wampler's work, founded Reference Software International to market his software. RSI's research and development headquarters were based in Albuquerque, while the company's sales and marketing department was based in Walnut Creek, California. === Success (1985–1992) === In August 1985, RSI released their first product: the Random House Reference Set, a new version of Proofreader for the IBM Personal Computer and compatibles, revised to be a terminate-and-stay-resident program that ran atop other word processors such as WordStar or WordPerfect. At the time, Reference Set was the only such program on the market that functioned like this. RSI netted $114,000 from sales of Reference Set by the end of 1985. In June 1986, they released version 2.0 of Grammatik as Grammatik II for the PC. The latter was a breakout hit for RSI, receiving praise in the press (including technology journals such as PC Magazine) and selling 1,000 units a month. In spring 1987, they released Reference Set II, which allowed users to import their own words into the built-in dictionary and added a thesaurus of 300,000 words. In November 1987, they released version 3.0 of Reference Set, which comprised two new field-specific dictionaries for the medical and legal professions. As well as the general Random House dictionary and thesaurus, it included Stedman's Medical Dictionary and Black's Law Dictionary. 
Emery consulted Paul Brest and Bob Jackson (professors of law at Stanford Law School and San Francisco State respectively) for the curation of the law dictionary, and Burton Grebin (at the time the executive director of Mount Saint Mary's Hospital) for the curation of the medical dictionary. In fall 1988, the company released Grammatik III, a total rewrite that made use of artificial intelligence to more accurately judge the grammar of sentences by breaking them down into a syntactic hierarchy. Grammatik III received universal acclaim, with Gloria Morris of InfoWorld calling it the apparent leader in the grammar checking field and Sandra Anderson of Mac Home Journal calling it "hands down ... the best of the industry" six years after its release. By 1989, the product had competitors in Correct Grammar by Lifetree Software and RightWriter by Rightsoft, Inc. By 1990, RSI achieved annual sales of $9.7 million. In the same year they released Grammatik IV, which was the first to offer direct integration with WordPerfect on both MS-DOS and Windows. In March 1992, by which point RSI had sold 1.5 million copies of Grammatik across all versions, the company released version 5 of the program, another rewrite that updated the lexicon further and added new functions such as word redundancy detection. Around the same time, the company introduced Easy Proof, a pared-down version of Grammatik intended for novice writers, students, and family computers. In 1991, the company was engaged in a trademark dispute with Systems Compatibility Corporation (SCC) of Chicago, Illinois, over the rights to the Software Toolkit title. Both companies had published software bundles bearing the name at the turn of the 1990s; SCC had published theirs first in 1988 and registered the trademark with the USPTO. SCC was granted a restraining order against RSI in January 1991. The following month, RSI agreed to rename their product, preventing a protracted legal battle. 
=== Decline and acquisition (1992–1993) === By early 1992, RSI achieved annual sales of more than $13 million, employed 120 people, and had opened international offices in London, Belgium, and Antwerp to sell foreign versions of Reference Set and Grammatik. The company reached peak employment in the middle of 1992, with 140 employees. However, RSI's launch of six disparate titles that year proved problematic for the company when they failed to sell as well as projected, and the company laid off employees by the dozens. By December 1992, only 71 employees were left, 32 of them in the San Francisco office. On the last day of 1992, RSI received an acquisition offer from WordPerfect Corporation, makers of the namesake word processor based in Orem, Utah. The deal was inked in January 1993, with RSI's stakeholders receiving $19 million. The company's remaining employees were absorbed into WordPerfect in Orem. WordPerfect continued selling Grammatik as a standalone product for several years.
Deborah Raji

Inioluwa Deborah Raji (born 1995/1996) is a Nigerian-Canadian computer scientist and socio-tech leader who works on algorithmic bias, AI accountability, and algorithmic auditing. A current Mozilla fellow, she has been recognized by MIT Technology Review and Forbes as one of the world's top young innovators. Raji started her work on racial bias in technology during her internship with Clarifai, where she was working on a computer vision model that would help clients flag inappropriate images as NSFW and recognized that people of color were more often tagged as NSFW than white people. Raji has previously worked with Joy Buolamwini, Timnit Gebru, and the Algorithmic Justice League on researching gender and racial bias in facial recognition technology. Her work on racial bias in facial recognition has forced companies to change their practices. She has also worked with Google’s Ethical AI team and been a research fellow at the Partnership on AI and the AI Now Institute at New York University, working on how to operationalize ethical considerations in machine learning engineering practice. == Early life and education == Raji was born in Port Harcourt, Nigeria, and moved to Mississauga, Ontario, Canada, when she was four years old. Eventually her family moved to Ottawa. She attended Colonel By Secondary School and completed the International Baccalaureate programme. She studied Engineering Science at the University of Toronto, graduating in 2019. In 2015, she founded Project Include, a nonprofit providing increased student access to engineering education, mentorship, and resources in low income and immigrant communities in the Greater Toronto Area. She began a PhD in Computer Science at the University of California, Berkeley, in August 2021. 
== Career and research == Raji worked with Joy Buolamwini at the MIT Media Lab and Algorithmic Justice League, where she audited commercial facial recognition technologies from Microsoft, Amazon, IBM, Face++, and Kairos. They found that these technologies were significantly less accurate for darker-skinned women than for white men. With support from other top AI researchers and increased public pressure and campaigning, their work led IBM and Amazon to agree to support facial recognition regulation and later halt the sale of their product to police for at least a year. Raji also interned at machine learning startup Clarifai, where she worked on a computer vision model for flagging images. She participated in a research mentorship program at Google and worked with their Ethical AI team on creating model cards, a documentation framework for more transparent machine learning model reporting. She also co-led the development of internal auditing practices at Google. Her contributions at Google were separately presented and published at the AAAI conference and ACM Conference on Fairness, Accountability, and Transparency. In 2019, Raji was a summer research fellow at The Partnership on AI, working on setting industry machine learning transparency standards and benchmarking norms. Raji was a Tech Fellow at the AI Now Institute, where she worked on algorithmic and AI auditing. Currently, she is a fellow at the Mozilla Foundation researching algorithmic auditing and evaluation. Raji's work on bias in facial recognition systems has been highlighted in the 2020 documentary Coded Bias directed by Shalini Kantayya. She also took part in the 2026 documentary The AI Doc: Or How I Became an Apocaloptimist directed by Daniel Roher. 
== Awards == 2019 Venture Beat AI Innovations Award in the category AI for Good (received with Joy Buolamwini and Timnit Gebru); 2020 MIT Technology Review 35 Under 35 Innovator Award; 2020 EFF Pioneer Award (received with Buolamwini and Gebru); 2021 Forbes 30 Under 30 Award in Enterprise Technology; 2021 100 Brilliant Women in AI Ethics Hall of Fame Honoree; 2023 Time magazine 100 Most Influential People in AI.
Aslı Çelikyılmaz

Aslı Çelikyılmaz is an engineer specializing in natural language processing, and particularly in natural language generation for software agents with advanced reasoning and real-world modeling capabilities. Educated in Turkey and Canada, she works in the US as senior research lead at Fundamental AI Research, Meta. She also holds an affiliate faculty position in computer science at the University of Washington, and is co-editor-in-chief of the journal Transactions of the Association for Computational Linguistics. == Education and career == Çelikyılmaz is a 1997 graduate of Istanbul Technical University, where she studied industrial engineering. After a 2002 master's degree in computer and information science from Seneca Polytechnic in Toronto, and a second master's degree in information science from the University of Toronto in 2005, she completed a Ph.D. in information science at the University of Toronto in 2008. She worked as a postdoctoral researcher at the University of California, Berkeley, from 2008 to 2010. In 2010 she joined Microsoft in Sunnyvale, California, where she became a senior scientist and later a senior principal researcher in Redmond, Washington. She added her affiliation with the University of Washington in 2018, and moved to Meta in Seattle in 2021. == Recognition == Çelikyılmaz was named to the 2026 class of IEEE Fellows, "for contributions to conversational systems and language generation".