Knapsack cryptosystems are cryptosystems whose security is based on the hardness of solving the knapsack problem. They remain quite unpopular because simple versions of these algorithms have been broken for several decades. However, that type of cryptosystem is a good candidate for post-quantum cryptography. The most famous knapsack cryptosystem is the Merkle-Hellman Public Key Cryptosystem, one of the first public key cryptosystems, published the same year as the RSA cryptosystem. However, this system has been broken by several attacks: one from Shamir, one by Adleman, and the low density attack. However, there exist modern knapsack cryptosystems that are considered secure so far: among them is Nasako-Murakami 2006. Knapsack cryptosystems, when not subject to classical cryptoanalysis, are believed to be difficult even for quantum computers. That is not the case for systems that rely on factoring large integers, like RSA, or computing discrete logarithms, like ECDSA, problems solved in polynomial time with Shor's algorithm.
CityEngine
ArcGIS CityEngine is a commercial 3D modeling program. Developed by Esri R&D Center Zurich (formerly Procedural Inc.), it specializes in the generation of 3D urban environments to support the creation of detailed large-scale 3D city models. Unlike traditional 3D modeling methodology, which uses computer-aided design (CAD) tools and techniques, CityEngine takes a procedural modeling approach which shapes generation via a rules-based system. Due to its integration with the wider ArcGIS platform, CityEngine can also be used with geographic information system (GIS) datasets. CityEngine can be used for urban planning and architecture, graphics visualization, game development, entertainment, and archeology. CityEngine can be used to visualize the building information modeling (BIM) data of buildings in a larger urban context, making for more realistic construction projects. == History and releases == === Software history === ArcGIS CityEngine, originally named Esri CityEngine, was developed at Swiss technology university ETH Zurich by Pascal Mueller, the co-founder and CEO of Procedural Inc. While researching for his PhD at the ETH Computer Vision Lab, Mueller invented a number of techniques for procedural modeling of 3D architecture that make up the foundation of CityEngine. CityEngine publically debuted at the 2001 SIGGRAPH conference; since then, additional research papers have been published that have contributed to CityEngine and its features. The first commercial version of CityEngine was released in 2008. In 2007, Procedural Inc. was founded and separated from ETH Zurich, the top-ranking technology university in Switzerland. In the summer of 2011, Procedural Inc. was acquired by Esri Inc., becoming Esri R&D Center Zurich. Esri CityEngine was renamed to ArcGIS CityEngine in June 2020 to offically make it a part of the ArcGIS software suite. === Releases === === Licensing and pricing === ArcGIS CityEngine is included in the Professional and Professional Plus tiers of ArcGIS Online. Pricing may vary by region and distributors. In the US, the professional tier costs US$2,200 per year; in the UK, it is £4,200 per year (excluding VAT). CityEngine can be purchased elsewhere via a local Esri partner. . Once purchased, users can download and obtain license details from the MyEsri portal. == Features == CGA (computer generated architecture) parametric modeling rules to control mass, geometry assets, proportions, or texturing of buildings or streets on a citywide scale Select a target location and import geo-referenced satellite imagery and 3D terrain of the location to more quickly build accurate urban environments through OpenStreetMap integration Interactively control specific street or building parameters, such as height or age Import/export geo-spatial/vector data with industry-standard formats such as Esri Shapefile, File Geodatabase, and OpenStreetMap, as well as file formats for WebGL, KMZ, Collada, Autodesk FBX, Autodesk Maya, 3DS, Wavefront OBJ, RenderMan RIB, Alembic, e-on software's Vue, Universal Scene Description USD, Khronos Group GLTF, Unreal Engine, and Unreal Datasmith Script and generate rules-based reports to show socioeconomic figures (e.g., Gross Floor Area (GFA) and Floor Area Ratio (FAR)) to analyze their urban design proposals. VR viewing of modeled environments with Samsung Gear VR Use a variety of materials through the Esri materials library == Procedural modeling == ArcGIS CityEngine uses a procedural modeling approach to automatically generate models through a predefined rule set. The rules are defined through a CGA shape grammar system, enabling the creation of complex parametric models. Users can change or add the shape grammar as needed. Urban environments can be modeled within CityEngine by starting with creating a street network (either from the street drawing tool or with data imported from map data). Then, lots may be subdivided as many times as specified, resulting in a map of multiple lots and streets. CityEngine can then be instructed to start generating the buildings using defined procedural modeling rules. At this point, the city model can be re-designed and adjusted by changing the parameters or the shape grammar. === Geodesign === Though CityEngine is not an analytical tool like GIS, discussions about geodesign often mention the use of ArcGIS CityEngine. As it can be used to enhance 3D shape generation in ArcGIS, ArcGIS CityEngine is a critical product to improve the applicability of geodesign by using geospatial information to design or analyze a city. == Applications == === Urban design and planning === Garsdale Design used ArcGIS CityEngine in the creation of city master plans in Iraq before 2013, both to model existing historic areas and also model future plans. Larger companies like Foster+Partners and HOK Architects have also used CityEngine in their urban planning projects. === Urban and environmental studies === Because its primary feature is building informative city models, some urban researchers use CityEngine to compare land-use planning schemes, for example in very dense global cities such as Hong Kong and Seoul. Environmental scientists can also utilize the instant 3D model generation in CityEngine, which can make for more convenient informative research than modeling a city by creating each building individually. === Game development === CityEngine can be used as a tool in the creation of video games that require detailed 3D environments to assign interactive scripts. === Movie industry === Zootopia (also known outside of the US as Zootopolis), which won the 2016 Academy Award for Best Animated Feature Film, used CityEngine to model the city in its movie. multi-scaling city, the designers used CityEngine due to its rule-based system. CityEngine was also used to create Big Hero 6's San-Fransokyo. === Military === Due to its integration with the Esri product suite and its ability to process geospatial data to create 3D scenes/maps, CityEngine can be used within military/defense organizations. == List of movies and TV shows using CityEngine == Studios and companies rarely state what software they use in their pipelines. When CityEngine is mentioned as a tool in production, it's often in a small reference in a larger article. Movies only claimed to use CityEngine by a single Esri employee Presented at FMX 2025 workshop == Ports == ArcGIS CityEngine is built on top of Eclipse IDE, and has therefore able to be used on Windows and Linux operating systems. Support for macOS was stopped in March 2021. == Plugins and extensions == ArcGIS CityEngine currently works with a number of third party 3D modeling, rendering, and analytical software products via its SDK and API; these currently are: ArcGIS CityEngine for ArcGIS Urban: ArcGIS Urban Suite Puma: ArcGIS CityEngine for Rhinoceros 3D Palladio: ArcGIS CityEngine for Houdini Serlio: ArcGIS CityEngine for Maya PyPRT: ArcGIS CityEngine for Python ArcGIS CityEngine provides a Python scripting interface built on Jython (current version 2.7.0) which allows users to create their own tools and functionality. == Publications ==
Glottochronology
Glottochronology (from Attic Greek γλῶττα 'tongue, language' and χρόνος 'time') is the part of lexicostatistics which involves comparative linguistics and deals with the chronological relationship between languages. The idea was developed by Morris Swadesh in the 1950s in his article on Salish internal relationships. He developed the idea under two assumptions: there indeed exists a relatively stable basic vocabulary (referred to as Swadesh lists) in all languages of the world; and, any replacements happen in a way analogous to radioactive decay in a constant percentage per time elapsed. Using mathematics and statistics, Swadesh developed an equation to determine when languages separated and give an approximate time of when the separation occurred. His methods aimed to aid linguistic anthropologists by giving them a definitive way to determine a separation date between two languages. The formula provides an approximate number of centuries since two languages were supposed to have separated from a singular common ancestor. His methods also purported to provide information on when ancient languages may have existed. Despite multiple studies and literature containing the information of glottochronology, it is not widely used today and is surrounded with controversy. Glottochronology tracks language separation from thousands of years ago but many linguists are skeptical of the concept because it is more of a 'probability' rather than a 'certainty.' On the other hand, some linguists may say that glottochronology is gaining traction because of its relatedness to archaeological dates. Glottochronology is not as accurate as archaeological data, but some linguists still believe that it can provide a solid estimate. Over time many different extensions of the Swadesh method evolved; however, Swadesh's original method is so well known that 'glottochronology' is usually associated with him. == Methodology == The original method of glottochronology presumed that the core vocabulary of a language is replaced at a constant (or constant average) rate across all languages and cultures and so can be used to measure the passage of time. The process makes use of a list of lexical terms and morphemes which are similar to multiple languages. Lists were compiled by Morris Swadesh and assumed to be resistant against borrowing (originally designed in 1952 as a list of 200 items, but the refined 100-word list in Swadesh (1955) is much more common among modern day linguists). The core vocabulary was designed to encompass concepts common to every human language such as personal pronouns, body parts, heavenly bodies and living beings, verbs of basic actions, numerals, basic adjectives, kin terms, and natural occurrences and events. Through a basic word list, one eliminates concepts that are specific to a particular culture or time period. It has been found through differentiating word lists that the ideal is really impossible and that the meaning set may need to be tailored to the languages being compared. Word lists are not homogenous throughout studies and they are often changed and designed to suit both languages being studied. Linguists find that it is difficult to find a word list where all words used are culturally unbiased. Many alternative word lists have been compiled by other linguists and often use fewer meaning slots. The percentage of cognates (words with a common origin) in the word lists is then measured. The larger the percentage of cognates, the more recently the two languages being compared are presumed to have separated. === Glottochronologic constant === Determining word lists rely on morpheme decay or change in vocabulary. Morpheme decay must stay at a constant rate for glottochronology to be applied to a language. This leads to a critique of the glottochronologic formula because some linguists argue that the morpheme decay rate is not guaranteed to stay the same throughout history. American Linguist Robert Lees obtained a value for the "glottochronological constant" (r) of words by considering the known changes in 13 pairs of languages using the 200 word list. He obtained a value of 0.8048 ± 0.0176 with 90% confidence. For his 100-word list Swadesh obtained a value of 0.86, the higher value reflecting the elimination of semantically unstable words. === Divergence time === The basic formula of glottochronology proposed by Morris Swadesh is: t = − ln ( c ) 2 ln ( r ) {\displaystyle t=-{\frac {\ln(c)}{2\ln(r)}}} t = a given period of time from one stage of the language to another (measured in millennia), c = proportion of wordlist items retained at the end of that period and r = rate of replacement for that word list. By testing historically verifiable cases in which t is known by nonlinguistic data (such as the approximate distance from Classical Latin to modern Romance languages), Swadesh arrived at the empirical value of approximately 0.14 for L, (c?) which means that the rate of replacement constitutes around 14 words from the 100-wordlist per millennium. This is represented in the table below. === Results === Glottochronology was applied to a range of language families, including Salishan, Indo-European, Japonic, Afro-Asiatic, Chinese and Mayan and other American languages. For Amerind, correlations have been obtained with radiocarbon dating and blood groups as well as archaeology. === Example Wordlist === Below is an example of a basic word list composed of basic Turkish words and their English translations. == Discussion == The concept of language change is old, and its history is reviewed in Hymes (1973) and Wells (1973). In some sense, glottochronology is a reconstruction of history and can often be closely related to archaeology. Many linguistic studies find the success of glottochronology to be found alongside archaeological data. Glottochronology itself dates back to the mid-20th century. An introduction to the subject is given in Embleton (1986) and in McMahon and McMahon (2005). Glottochronology has been controversial ever since, partly because of issues of accuracy but also because of the question of whether its basis is sound (for example, Bergsland 1958; Bergsland and Vogt 1962; Fodor 1961; Chrétien 1962; Guy 1980). The concerns have been addressed by Dobson et al. (1972), Dyen (1973) and Kruskal, Dyen and Black (1973). The assumption of a single-word replacement rate can distort the divergence-time estimate when borrowed words are included (Thomason and Kaufman 1988). The presentations vary from "Why linguists don't do dates" to the one by Starostin discussed below. Since its original inception, glottochronology has been rejected by many linguists, mostly Indo-Europeanists of the school of the traditional comparative method. Criticisms have been answered in particular around three points of discussion: Criticism levelled against the higher stability of lexemes in Swadesh lists alone (Haarmann 1990) misses the point because a certain amount of losses only enables the computations (Sankoff 1970). The non-homogeneity of word lists often leads to lack of understanding between linguists. Linguists also have difficulties finding a completely unbiased list of basic cultural words. it can take a long time for linguists to find a viable word list which can take several test lists to find a usable list. Traditional glottochronology presumes that language changes at a stable rate. Thus, in Bergsland & Vogt (1962), the authors make an impressive demonstration, on the basis of actual language data verifiable by extralinguistic sources, that the "rate of change" for Icelandic constituted around 4% per millennium, but for closely connected Riksmal (Literary Norwegian), it would amount to as much as 20% (Swadesh's proposed "constant rate" was supposed to be around 14% per millennium). That and several other similar examples effectively proved that Swadesh's formula would not work on all available material, which is a serious accusation since evidence that can be used to "calibrate" the meaning of L (language history recorded during prolonged periods of time) is not overwhelmingly large in the first place. It is highly likely that the chance of replacement is different for every word or feature ("each word has its own history", among hundreds of other sources:). That global assumption has been modified and downgraded to single words, even in single languages, in many newer attempts (see below). There is a lack of understanding of Swadesh's mathematical/statistical methods. Some linguists reject the methods in full because the statistics lead to 'probabilities' when linguists trust 'certainties' more. A serious argument is that language change arises from socio-historical events that are, of course, unforeseeable and, therefore, uncomputable. == Modifications == Somewhere in between the original concept of Swadesh and the rejection of glottochronology in its entirety lies the idea that glottochronology as a formal method of linguistic
Best AI Voice Assistants in 2026
Trying to pick the best AI voice assistant? An AI voice assistant is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI voice assistant slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.
Theano (software)
Theano is a Python library and optimizing compiler for manipulating and evaluating mathematical expressions, especially matrix-valued ones. In Theano, computations are expressed using a NumPy-esque syntax and compiled to run efficiently on either CPU or GPU architectures. == History == Theano is an open source project primarily developed by the Montreal Institute for Learning Algorithms (MILA) at the Université de Montréal. The name of the software references the ancient philosopher Theano, long associated with the development of the golden mean. On 28 September 2017, Pascal Lamblin posted a message from Yoshua Bengio, Head of MILA: major development would cease after the 1.0 release due to competing offerings by strong industrial players. Theano 1.0.0 was then released on 15 November 2017. On 17 May 2018, Chris Fonnesbeck wrote on behalf of the PyMC development team that the PyMC developers will officially assume control of Theano maintenance once the MILA development team steps down. On 29 January 2021, they started using the name Aesara for their fork of Theano. On 29 Nov 2022, the PyMC development team announced that the PyMC developers will fork the Aesara project under the name PyTensor. == Sample code == The following code is the original Theano's example. It defines a computational graph with 2 scalars a and b of type double and an operation between them (addition) and then creates a Python function f that does the actual computation. == Examples == === Matrix Multiplication (Dot Product) === The following code demonstrates how to perform matrix multiplication using Theano, which is essential for linear algebra operations in many machine learning tasks. === Gradient Calculation === The following code uses Theano to compute the gradient of a simple operation (like a neuron) with respect to its input. This is useful in training machine learning models (backpropagation). === Building a Simple Neural Network === The following code shows how to start building a simple neural network. This is a very basic neural network with one hidden layer. === Broadcasting in Theano === The following code demonstrates how broadcasting works in Theano. Broadcasting allows operations between arrays of different shapes without needing to explicitly reshape them.
Compute (machine learning)
In machine learning and deep learning, compute is the amount of computing power or computational resources required to train machine learning models and large language models. More broadly, compute is the computational power or resources necessary for a computer or computer program to function. == Definition == Compute is commonly defined as the amount of computing power or computational resources required to train machine learning and large language models. The term "compute" has also been more broadly applied to cloud computing, referencing processing power, memory, networking, storage, and other resources required for the computation of any program. Compute is measured in petaflop/s-days and is used to document AI training. A petaflop/s-day (pfs-day) consists of performing 1015 neural net operations per second for one day, or a total of about 1020 operations. The compute-time product serves as a mental convenience, similar to kilowatt-hour for energy. An amount of compute is meant to give an idea of the number of actual operations performed. == History == In a 2018 analysis titled "AI and compute", artificial intelligence company OpenAI introduced the concept of compute. OpenAI identified two eras of training AI systems in terms of compute-usage. From 1959 to 2012, compute roughly followed Moore’s law. Between 2012 and 2018, the amount of compute used in the largest AI training runs increased exponentially, growing by more than 300,000 times — roughly doubling every 3.4 months. By comparison, Moore’s Law doubled every two years over the same period. One of the largest models, released in 2020, used 600,000 times more computing power than the 2012 model. After 2020, compute growth began to slow down, with the compute needed for the largest AI models continuing to slow down in 2023. The notion of compute has become increasingly used from the mid-2020s onwards. == Compute growth and AI progress == Larger AI models trained on more data and using more computational resources, tend to perform better. This happens even if the algorithms themselves remain unchanged. As early as 2018, OpenAI noted the exponential increase in compute to be have a key role in AI progress. OpenAI considers three factors drive the advance of AI: algorithmic innovation, data, and the amount of compute available for training. AI models with more compute not only improve in the tasks they were trained on but can develop emergent abilities. Incremental improvements can lead to more abrupt leaps in capabilities. AI provider SpaceXAI said in 2026 that their AI progress is driven by compute and used it a key metric in the AI training of its supercomputer Colossus, the which contains 1 million GPUs. Anthropic has a contract of $1.25 billion per month with SpaceXAI to buy all the compute capacity at Colossus 1 data center. === Criticism and policy === Increasing, promoting or constraining progress in artificial intelligence has often be done via controlling the amount of compute. Policymarkers have enacted policies and provided support to make compute resources more accessible to domestic AI researchers. In a January 2022 report, the Center for Security and Emerging Technology (CSET) suggested to institutions that increasingly powerful and generalizable AI (AGI) will likely require other strategies than maximizing compute. Some AI researchers are also concerned that government might exclusively focus on scaling compute instead of other strategies. The CSET has reported on the various bottlenecks which could explain why deep learning needs for compute have slow down: training is expensive and training extremely large models generates traffic jams across many processors that are difficult to manage. there is a limited supply of AI chips (see AI chip memory shortage). CSET advances that the main resource is human capital, specifically talented researchers — according to a 2023 published survey of more than 400 AI researchers, academic and private sector workers. The survey found that AI researchers are not primarily or exclusively constrained by compute access. However, both academic and industry AI researchers equally report concerns that insufficient compute could prevent them from contributing meaningfully to AI research in the future. High compute users are more concerned about compute access. When asked about which resource provided by the government would be the most useful to them, some AI researchers select compute, other prefer grant funding. For this goal, CSET advised policymakers to ensure that even researchers with smaller budgets could effectively contribute to AI research. Other proposed strategies include using contemporary AI algorithms, managing modern AI infrastructure or focusing on interdisciplinary work between the AI field and other fields of computer science. A 2024 study on compute access found that academic-only AI research teams often have less compute intensive research topics, especially foundation models, compared to industry AI labs. As a consequence, academia is likely to play a smaller role in advancing such techniques. The researchers suggest nationally-sponsored computing infrastructure as well as open science initiatives to boost academic compute access. === Data === A 2022 study found that current large language models are significantly under-trained, a consequence of focusing on scaling language models whilst keeping the amount of training data constant. By training over 400 language models of various parameter and token size, they found that "for compute-optimal training", the model size and the number of training tokens should ideally be scaled equally: for every doubling of model size the number of training tokens should also be doubled.
General Internet Corpus of Russian
General Internet Corpus of Russian (GICR) is a corpus of Russian internet texts that has been accessible on request through an online query interface since 2013. The corpus includes rich text materials from the blogosphere, social networks, major news sources and literary magazines. == Goals of the project == The project has the status of an educational and scientific one, and many tasks of computational linguistics are solved by independent researchers and research groups with the materials obtained by GICR. While other corpus projects of Russian are focused on fiction and edited texts, General Internet Corpus provides linguists timely opportunity to learn the language as it is, with all the slang and regional peculiarities. Corpus gives the opportunity to carry out research in Linguistic research of a wide range: dialectological research, study of word distribution, study of the language of the social networks, study of the influence of gender, age and other factors on the language, frequency of words, fixed expressions and different constructions, stylistic features of texts of different segments of the Internet, etc. Social media analysis Corpus-based machine learning for evaluating automatic tagging At various times, student papers and independent researches were carried out on the project material by students, graduates and employees of MSU, MIPT, Russian State Humanitarian University, Novosibirsk State University, Higher School of Economics, Russian Academy of Sciences, SFU, CSU, SGMP, IAAS of MSU. Scientific project leaders: Belikov V. - RSUH, Moscow, Russia Selegey V. - RSUH, ABBYY, Moscow, Russia Sharoff S. - RSUH, Moscow, Russia; University of Leeds, UK The organizations involved in support of GICR: Russian State University of Humanities ABBYY Company Moscow Institute of Physics and Technology Skolkovo Institute of Science and Technology == Size and content of the corpus == Corpus size for the summer 2016 is 19.8 billion tokens, of which 49% are from VKontakte, 40% are from LiveJournal, another 4% - from Mail.ru Blogs and News, and 2% - from Russian Magazine Hall. The sources collected in news segment are: RIA Novosti, Regnum, Lenta.ru, Rosbalt. Texts are provided with metamarkup (by date of creation of the text, sex, place and year of birth of the author, Internet genre, etc.); all texts are provided with automatic morphological tagging and lemmatization. Most of the texts collected are of 2013–2014 years of creation, although in some segments, such as in Russian Magazine Hall, there are some texts collected since 1994. GICR is one of the few mega-corpora projects nowadays, which means its available size is reaching several billion of words. == Access == Currently the interface of GICR is in beta stage, so access to the search in the corpora is provided and is free, but is available for researchers on request.