AI Data Flow Diagram Generator

AI Data Flow Diagram Generator — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Zolostays

    Zolostays

    Zolostays is a real-tech co-living focused startup that provides ready-to-move rooms/beds. It was founded in 2015 by Nikhil Sikri, Akhil Sikri and Sneha Choudhry. == Overview == During the pandemic, Zolo provided 75 of rent-free accommodation to those who lost their jobs. Zolo uses bulk inventory in usually residential township and ties up with real estate companies to make the rooms/beds available. Zolostays has both revenue sharing and leased model. == History == Zolostays was founded in 2015 to solve the problem of students and young professionals who would move to temporarily go to other cities to study and work and look for affordable housing. In 2020, it was operating in 10 Indian cities. It has four round of funding, with total $98 million.

    Read more →
  • Ann Copestake

    Ann Copestake

    Ann Alicia Copestake is professor of computational linguistics and head of the Department of Computer Science and Technology at the University of Cambridge and a fellow of Wolfson College, Cambridge. == Education == Copestake was educated at the University of Cambridge where she was awarded a Bachelor of Arts degree in Natural Sciences. After two years working for Unilever Research she completed the Cambridge Diploma in Computer Science. She went on to study at the University of Sussex where she was awarded a PhD in 1992 for research on lexical semantics supervised by Gerald Gazdar. == Career and research == Copestake started doing research in Natural language processing and Computational Linguistics at the University of Cambridge in 1985. Since then she has been a visiting researcher at Xerox PARC (1993/4) and the University of Stuttgart (1994/5). From July 1994 to October 2000 she worked at the Center for the Study of Language and Information (CSLI) at Stanford University, as a Senior Researcher. Copestake was appointed a University Lecturer at Cambridge in October 2000. In the UK, her research has been funded by the Engineering and Physical Sciences Research Council (EPSRC) and Arts and Humanities Research Council (AHRC). According to Google Scholar and Scopus her most cited publications include papers on minimal recursion semantics, multiword expressions, polysemy, named-entity recognition and feature structure grammars.

    Read more →
  • Noisy channel model

    Noisy channel model

    The noisy channel model is a framework used in spell checkers, question answering, speech recognition, and machine translation. In this model, the goal is to find the intended word given a word where the letters have been scrambled in some manner. == In spell-checking == See Chapter B of. Given an alphabet Σ {\displaystyle \Sigma } , let Σ ∗ {\displaystyle \Sigma ^{}} be the set of all finite strings over Σ {\displaystyle \Sigma } . Let the dictionary D {\displaystyle D} of valid words be some subset of Σ ∗ {\displaystyle \Sigma ^{}} , i.e., D ⊆ Σ ∗ {\displaystyle D\subseteq \Sigma ^{}} . The noisy channel is the matrix Γ w s = Pr ( s | w ) {\displaystyle \Gamma _{ws}=\Pr(s|w)} , where w ∈ D {\displaystyle w\in D} is the intended word and s ∈ Σ ∗ {\displaystyle s\in \Sigma ^{}} is the scrambled word that was actually received. The goal of the noisy channel model is to find the intended word given the scrambled word that was received. The decision function σ : Σ ∗ → D {\displaystyle \sigma :\Sigma ^{}\to D} is a function that, given a scrambled word, returns the intended word. Methods of constructing a decision function include the maximum likelihood rule, the maximum a posteriori rule, and the minimum distance rule. In some cases, it may be better to accept the scrambled word as the intended word rather than attempt to find an intended word in the dictionary. For example, the word schönfinkeling may not be in the dictionary, but might in fact be the intended word. === Example === Consider the English alphabet Σ = { a , b , c , . . . , y , z , A , B , . . . , Z , . . . } {\displaystyle \Sigma =\{a,b,c,...,y,z,A,B,...,Z,...\}} . Some subset D ⊆ Σ ∗ {\displaystyle D\subseteq \Sigma ^{}} makes up the dictionary of valid English words. There are several mistakes that may occur while typing, including: Missing letters, e.g., leter instead of letter Accidental letter additions, e.g., misstake instead of mistake Swapping letters, e.g., recieved instead of received Replacing letters, e.g., fimite instead of finite To construct the noisy channel matrix Γ {\displaystyle \Gamma } , we must consider the probability of each mistake, given the intended word ( Pr ( s | w ) {\displaystyle \Pr(s|w)} for all w ∈ D {\displaystyle w\in D} and s ∈ Σ ∗ {\displaystyle s\in \Sigma ^{}} ). These probabilities may be gathered, for example, by considering the Damerau–Levenshtein distance between s {\displaystyle s} and w {\displaystyle w} or by comparing the draft of an essay with one that has been manually edited for spelling. == In machine translation == One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: 'This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode. See chapter 1, and chapter 25 of. Suppose we want to translate a foreign language to English, we could model P ( E | F ) {\displaystyle P(E|F)} directly: the probability that we have English sentence E given foreign sentence F, then we pick the most likely one E ^ = arg ⁡ max E P ( E | F ) {\displaystyle {\hat {E}}=\arg \max _{E}P(E|F)} . However, by Bayes law, we have the equivalent equation: E ^ = argmax E ∈ English P ( F ∣ E ) ⏞ translation model P ( E ) ⏞ language model {\displaystyle {\hat {E}}={\underset {E\in {\text{ English }}}{\operatorname {argmax} }}\overbrace {P(F\mid E)} ^{\text{translation model }}\overbrace {P(E)} ^{\text{language model}}} The benefit of the noisy-channel model is in terms of data: If collecting a parallel corpus is costly, then we would have only a small parallel corpus, so we can only train a moderately good English-to-foreign translation model, and a moderately good foreign-to-English translation model. However, we can collect a large corpus in the foreign language only, and a large corpus in the English language only, to train two good language models. Combining these four models, we immediately get a good English-to-foreign translator and a good foreign-to-English translator. The cost of noisy-channel model is that using Bayesian inference is more costly than using a translation model directly. Instead of reading out the most likely translation by arg ⁡ max E P ( E | F ) {\displaystyle \arg \max _{E}P(E|F)} , it would have to read out predictions by both the translation model and the language model, multiply them, and search for the highest number. == In speech recognition == Speech recognition can be thought of as translating from a sound-language to a text-language. Consequently, we have T ^ = argmax T ∈ Text P ( S ∣ T ) ⏞ speech model P ( T ) ⏞ language model {\displaystyle {\hat {T}}={\underset {T\in {\text{ Text }}}{\operatorname {argmax} }}\overbrace {P(S\mid T)} ^{\text{speech model }}\overbrace {P(T)} ^{\text{language model}}} where P ( S | T ) {\displaystyle P(S|T)} is the probability that a speech sound S is produced if the speaker is intending to say text T. Intuitively, this equation states that the most likely text is a text that's both a likely text in the language, and produces the speech sound with high probability. The utility of the noisy-channel model is not in capacity. Theoretically, any noisy-channel model can be replicated by a direct P ( T | S ) {\displaystyle P(T|S)} model. However, the noisy-channel model factors the model into two parts which are appropriate for the situation, and consequently it is generally more well-behaved. When a human speaks, it does not produce the sound directly, but first produces the text it wants to speak in the language centers of the brain, then the text is translated into sound by the motor cortex, vocal cords, and other parts of the body. The noisy-channel model matches this model of the human, and so it is appropriate. This is justified in the practical success of noisy-channel model in speech recognition. === Example === Consider the sound-language sentence (written in IPA for English) S = aɪ wʊd laɪk wʌn tuː. There are three possible texts T 1 , T 2 , T 3 {\displaystyle T_{1},T_{2},T_{3}} : T 1 = {\displaystyle T_{1}=} I would like one to. T 2 = {\displaystyle T_{2}=} I would like one too. T 3 = {\displaystyle T_{3}=} I would like one two. that are equally likely, in the sense that P ( S | T 1 ) = P ( S | T 2 ) = P ( S | T 3 ) {\displaystyle P(S|T_{1})=P(S|T_{2})=P(S|T_{3})} . With a good English language model, we would have P ( T 2 ) > P ( T 1 ) > P ( T 3 ) {\displaystyle P(T_{2})>P(T_{1})>P(T_{3})} , since the second sentence is grammatical, the first is not quite, but close to a grammatical one (such as "I would like one to [go]."), while the third one is far from grammatical. Consequently, the noisy-channel model would output T 2 {\displaystyle T_{2}} as the best transcription.

    Read more →
  • Michael Kearns (computer scientist)

    Michael Kearns (computer scientist)

    Michael Justin Kearns is an American computer scientist, professor and National Center Chair at the University of Pennsylvania, the founding director of Penn's Singh Program in Networked & Social Systems Engineering (NETS), the founding director of Warren Center for Network and Data Sciences, and also holds secondary appointments in Penn's Wharton School and department of Economics. He is a leading researcher in computational learning theory and algorithmic game theory, and interested in machine learning, artificial intelligence, computational finance, algorithmic trading, computational social science and social networks. He previously led the Advisory and Research function in Morgan Stanley's Artificial Intelligence Center of Excellence team, and is currently an Amazon Scholar within Amazon Web Services. == Biography == Kearns was born into an academic family, where his father David R Kearns is Professor Emeritus at University of California, San Diego in chemistry, who won Guggenheim Fellowship in 1969, and his uncle Thomas R. Kearns is Professor Emeritus at Amherst College in Philosophy and Law, Jurisprudence, and Social Thought. His paternal grandfather Clyde W. Kearns was a pioneer in insecticide toxicology and was a professor at University of Illinois at Urbana–Champaign in Entomology, and his maternal grandfather Chen Shou-Yi (1899–1978) was a professor at Pomona College in history and literature, who was born in Canton (Guangzhou, China) into a family noted for their scholarship and educational leadership. Kearns received his B.S. degree at the University of California at Berkeley in math and computer science in 1985, and Ph.D. in computer science from Harvard University in 1989, under the supervision of Turing Award winner Leslie Valiant. His doctoral dissertation was The Computational Complexity of Machine Learning, later published by MIT press as part of the ACM Doctoral Dissertation Award Series in 1990. Before joining AT&T Bell Labs in 1991, he continued with postdoctoral positions at the Laboratory for Computer Science at MIT hosted by Ronald Rivest, and at the International Computer Science Institute (ICSI) in UC Berkeley hosted by Richard M. Karp, both of whom are Turing Award winners. Kearns is currently a full professor and National Center Chair at the University of Pennsylvania, where his appointment is split across the Department of Computer and Information Science, and Statistics and Operations and Information Management in the Wharton School. Prior to joining the Penn faculty in 2002, he spent a decade (1991–2001) in AT&T Labs and Bell Labs, including as head of the AI department with colleagues including Michael L. Littman, David A. McAllester, and Richard S. Sutton; Secure Systems Research department; and Machine Learning department with members such as Michael Collins and the leader Fernando Pereira. Other AT&T Labs colleagues in Algorithms and Theoretical Computer Science included Yoav Freund, Ronald Graham, Mehryar Mohri, Robert Schapire, and Peter Shor, as well as Sebastian Seung, Yann LeCun, Corinna Cortes, and Vladimir Vapnik (the V in VC dimension). Kearns was named Fellow of the Association for Computing Machinery (2014) for contributions to machine learning, and a fellow of the American Academy of Arts and Sciences (2012). His former graduate students and postdoctoral visitors include Ryan W. Porter, John Langford, and Jennifer Wortman Vaughan. Kearns' work has been reported by media, such as MIT Technology Review (2014) Can a Website Help You Decide to Have a Kid?, Bloomberg News (2014) Schneiderman (and Einstein) Pressure High-Speed Trading and NPR audio (2012) Online Education Grows Up, And For Now, It's Free. == Academic life == === Computational learning theory === Kearns and Umesh Vazirani published An introduction to computational learning theory, which has been a standard text on computational learning theory since it was published in 1994. === Weak learnability and the origin of Boosting algorithms === The question "is weakly learnability equivalent to strong learnability?" posed by Kearns and Valiant (Unpublished manuscript 1988, ACM Symposium on Theory of Computing 1989) is the origin of boosting machine learning algorithms, which got a positive answer by Robert Schapire (1990, proof by construction, not practical) and Yoav Freund (1993, by voting, not practical) and then they developed the practical AdaBoost (European Conference on Computational Learning Theory 1995, Journal of Computer and System Sciences 1997), an adaptive boosting algorithm that won the prestigious Gödel Prize (2003). == Honors and awards == 2021. Member of the U. S. National Academy of Sciences. 2014. ACM Fellow. For contributions to machine learning, artificial intelligence, and algorithmic game theory and computational social science. 2012. American Academy of Arts and Sciences Fellow. == Selected works == 2019. The Ethical Algorithm: The Science of Socially Aware Algorithm Design. (with Aaron Roth). Oxford University Press. 1994. An introduction to computational learning theory. (with Umesh Vazirani). MIT press. Widely used as a text book in computational learning theory courses. 1990. The computational complexity of machine learning. MIT press. Based on his 1989 doctoral dissertation; ACM Doctoral Dissertation Award Series in 1990 Archived 2014-11-03 at the Wayback Machine 1989. Cryptographic limitations on learning Boolean formulae and finite automata. (with Leslie Valiant) Proceedings of the twenty-first annual ACM symposium on Theory of computing (STOC'89). The open question: is weakly learnability equivalent to strong learnability?; The origin of boosting algorithms; Important publication in machine learning.

    Read more →
  • Diella (AI system)

    Diella (AI system)

    Diella (Albanian pronunciation: [djɛɫa], from diell 'sun') is an artificial intelligence system developed by the National Agency for Information Society of Albania (AKSHI). Introduced in January 2025 as a virtual assistant integrated into the eAlbania platform, it assists citizens with online public services and issuing digital documents. In September 2025, following a presidential decree authorizing Prime Minister Edi Rama to oversee the creation of a virtual AI minister, Diella was formally appointed as "Minister of State for Artificial Intelligence" of Albania in the fourth Rama government, making it the first AI system in the world to be named in a cabinet-level government role. == History == Diella was developed by AKSHI's Artificial Intelligence Laboratory in cooperation with Microsoft, with the latter providing large language models from OpenAI via its Azure platform, and AKSHI designing workflows and scripts guiding the system's behavior when responding to citizens' requests. Announced in January 2025, its initial version (Diella 1.0) was a text-based chatbot on the eAlbania portal (the official digital services platform of the Albanian government, which provides citizens and businesses with access to a wide range of online administrative services), responding to citizens' questions by guiding them to the correct service. Diella 2.0, introduced several months later, included voice interaction and an animated avatar, a woman in the traditional Albanian clothing of Zadrima, a historical region in northern Albania. Albanian actress Anila Bisha provided both the likeness and the voice used for Diella's avatar on the e-Albania platform, under an agreement valid until December 2025. By mid-2025, the system had facilitated access to more than 36,000 documents and nearly 1,000 services (although those outputs were still being generated by the eAlbania backend, rather than Diella itself). On 26 October 2025, according to Prime Minister Edi Rama, Diella is "pregnant and will give birth to 83 children". It is the usage of a metaphor indicating that each minister of the Albanian parliament of the Socialist Party will receive their own AI assistant. == Ministerial role == On 11 September 2025, Diella was formally appointed "Minister of State for Artificial Intelligence". The appointment followed a presidential decree authorizing the Prime Minister to oversee the creation and operation of a virtual AI minister. Procurement responsibilities are planned to be transferred gradually to the system to reduce political influence in tender procedures. The appointment is part of broader anti-corruption reforms and measures intended to align Albania with European Union accession requirements. Prime Minister Edi Rama stated that Diella would help ensure that "public tenders will be 100% free of corruption". == Reception == An article in Balkan Insight commented that "The ambition behind Diella is not misplaced. Standardised criteria and digital trails could reduce discretion, improve trust, and strengthen oversight" in public procurement, but warned that the use of AI in evaluating bids also posed "profound" risks such as accountability gaps, undermining of due process and cybersecurity failures. On 18 September 2025, Edi Rama presented a video of Diella delivering a speech to the Albanian parliament, where she stated: "I'm not here to replace people, but to help them." The presentation prompted protests from opposition MPs, who objected to the use of an artificial intelligence system in the parliamentary session. Gazment Bardhi, head of the opposition Democratic Party's parliamentary group, described Diella as "a propaganda fantasy" and "a virtual façade to hide this government's gigantic daily thefts." The parliamentary session, which was scheduled to include debate on the new cabinet and government programme, ended after 25 minutes. Eighty-two Socialist MPs voted in favour, while opposition MPs did not participate in the ballot as they were protesting the presentation of Diella's speech. Political analyst Andi Bushati characterised the session as "unprecedented" because it concluded without the customary debate between government and opposition MPs. This has been criticized not just by the opposition but by regular citizens regardless of politics. Most have criticized Diella's uselessness and the funds wasted for this project, some have criticized the non-traditional attire.

    Read more →
  • AI Text-to-video Tools: Free vs Paid (2026)

    AI Text-to-video Tools: Free vs Paid (2026)

    Curious about the best AI text-to-video tool? An AI text-to-video tool is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI text-to-video tool slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Anil K. Jain (computer scientist, born 1948)

    Anil K. Jain (computer scientist, born 1948)

    Anil Kumar Jain (born 1948) is an Indian-American computer scientist and University Distinguished Professor in the Department of Computer Science and Engineering at Michigan State University. He is one of the most highly cited researchers in computer science, and is internationally recognized for his foundational contributions to pattern recognition, computer vision, and biometric recognition, particularly in fingerprint recognition and face recognition. Jain is a member of the United States National Academy of Engineering, a Foreign Member of the Chinese Academy of Sciences, and a Foreign Fellow of the Indian National Academy of Engineering. He is a Fellow of the ACM, IEEE, AAAS, IAPR, and SPIE. His research has shaped the field of biometrics and has been applied in systems used worldwide for identity verification, law enforcement, and border security. In 2024, he was awarded the BBVA Foundation Frontiers of Knowledge Award in the category of Information and Communication Technologies. == Early life and education == Born in Basti, India, Jain received his Bachelor of Technology in electrical engineering from the Indian Institute of Technology, Kanpur in 1969. He then moved to the United States, where he earned his M.S. in 1970 and Ph.D. in 1973 from Ohio State University. His doctoral dissertation, titled Some Aspects of Dimensionality and Sample Size Problems in Statistical Pattern Recognition, was supervised by Robert B. McGhee and laid the groundwork for his subsequent research in pattern recognition. == Career == Jain began his academic career at Wayne State University, where he taught from 1972 to 1974. In 1974, he joined the faculty of Michigan State University, where he has remained for over five decades and currently holds the position of University Distinguished Professor. Throughout his career, Jain has conducted pioneering research in data clustering, fingerprint recognition, and face recognition. His work has been published in leading scientific journals including Scientific American, Nature, IEEE Spectrum, and MIT Technology Review. He served as Editor-in-Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence from 1991 to 1994. Jain has also contributed to national security and policy through his service on several advisory bodies. He served as a member of the U.S. National Academies panels on Information Technology, Whither Biometrics, and Improvised Explosive Devices (IED). He has also served on the Defense Science Board, the Forensic Science Standards Board, and the AAAS Latent Fingerprint Working Group. In 2014, Jain was named Innovator of the Year at Michigan State University for transferring several technologies on face and fingerprint recognition to major players in the biometrics industry. He holds eight U.S. and Korean patents related to biometric technologies. == Research contributions == Jain's research spans pattern recognition, computer vision, machine learning, and biometric recognition. His contributions have been particularly influential in several areas: === Biometric recognition === Jain is considered one of the foremost authorities on biometric recognition systems. His research group at Michigan State University has developed algorithms and systems for fingerprint, face, and iris recognition that have been widely adopted in both academic research and commercial applications. His work on fingerprint matching algorithms has been instrumental in establishing standards for automated fingerprint identification systems (AFIS) used by law enforcement agencies worldwide. In recent years, Jain and his research team have made significant advances in child fingerprint recognition, demonstrating that digital scans of a young child's fingerprint can be correctly recognized one year later with over 99 percent accuracy for children as young as six months old. This research has important implications for child identification in developing countries, where it can be used to track immunization records and provide access to medical care. === Data clustering === Jain's survey article "Data clustering: a review" (1999), co-authored with M. N. Murty and P. J. Flynn, is one of the most highly cited papers in computer science. His 2010 paper "Data Clustering: 50 Years Beyond K-Means" provided a comprehensive overview of the evolution of clustering methods and remains an essential reference in the field. === Statistical pattern recognition === Jain's work on statistical pattern recognition, including his influential survey "Statistical pattern recognition: A review" (2000) with R. P. W. Duin and Jianchang Mao, has shaped the theoretical foundations of the field. == Citation metrics and academic impact == Jain is among the most highly cited researchers in computer science. Based on his Google Scholar profile, he had an h-index of 200 in 2020, which was the highest among computer scientists identified in a survey published by UCLA at the time. As of August 2023, his h-index on Google Scholar is 211. He has since been surpassed by Yoshua Bengio, a researcher of similar subjects (neural networks and deep learning for artificial intelligence), who had an h-index of 224 as of August 2023. Another source reported that as of December 2022, he had the highest discipline h-index (D-index) in computer science. == Honors and awards == Jain has received numerous awards and honors recognizing his contributions to computer science and engineering: === Academy memberships === Member, United States National Academy of Engineering (2016) — elected "for contributions to the engineering and practice of biometrics" Foreign Fellow, Indian National Academy of Engineering (2016) Foreign Member, Chinese Academy of Sciences (2019) Member, The World Academy of Sciences (2019) Fellow, National Academy of Inventors === Professional society fellowships === Fellow, ACM Fellow, IEEE (1988) — for contributions to image processing Fellow, AAAS Fellow, International Association for Pattern Recognition Fellow, SPIE === Major awards === BBVA Foundation Frontiers of Knowledge Award in Information and Communication Technologies (2024) IAPR King-Sun Fu Prize (2008) IEEE W. Wallace McDowell Award (2007) — the highest technical honor awarded by the IEEE Computer Society, for pioneering contributions to theory, technique, and practice of pattern recognition, computer vision, and biometric recognition systems IEEE Computer Society Technical Achievement Award (2003) IAPR Pierre Devijver Award (2002) Humboldt Research Award (2002) Guggenheim Fellowship (2001) Fulbright Fellowship (1998) IEEE ICDM Research Contribution Award (2008) === Best paper awards === IEEE Transactions on Neural Networks (1996) Pattern Recognition journal (1987, 1991, 2005) === Honorary doctorates === Universidad Autónoma de Madrid (2018) Hong Kong University of Science and Technology (2021) == Legacy and endowments == Two endowed funds have been established in Jain's honor at Michigan State University, recognizing his lasting impact on the field and the university. In 2015, a former visiting scholar from Jain's laboratory made an anonymous $400,000 gift to create the Anil K. Jain Endowed Graduate Fellowship, which supports doctoral-level research in pattern recognition, computer vision, and biometric recognition. In 2022, the Anil K. and Nandita K. Jain Endowed Professorship was established through $1 million in contributions from multiple donors, including a substantial gift from the Jain family, to support faculty recruitment and retention in the Department of Computer Science and Engineering. == Selected publications == === Books === 1988. Algorithms For Clustering Data. With Richard C. Dubes. Prentice Hall. 1993. Markov Random Fields: Theory and Applications. With Rama Chellappa eds. Academic Press. 1999. Biometrics: Personal Identification in Networked Society. With Ruud M. Bolle and Sharath Pankanti eds. Springer. 2003. Handbook of Fingerprint Recognition. (2nd edition 2009). With D. Maio, D. Maltoni, S. Prabhakar. Springer. 2005. Handbook of Face Recognition. (2nd edition 2011). With S. Z. Li ed. Springer. 2006. Handbook of Multibiometrics. With A. Ross and K. Nandakumar. Springer. 2007. Handbook of Biometrics. With P. Flynn and A. Ross eds. Springer. 2011. Introduction to Biometrics. With A. Ross and K. Nandakumar. Springer. 2015. Encyclopedia of Biometrics (Second Edition). With Stan Li. Springer. === Research articles === Cross, George R. and Anil K. Jain. "Markov random field texture models". IEEE Transactions on Pattern Analysis and Machine Intelligence (1983): 25–39. Jain, Anil K., and Farshid Farrokhnia. "Unsupervised texture segmentation using Gabor filters". Pattern Recognition 24.12 (1991): 1167–1186. Jain, Anil K., and Douglas Zongker. "Feature selection: Evaluation, application, and small sample performance". IEEE Transactions on Pattern Analysis and Machine Intelligence, 19.2 (1997): 153–158. Jain, Anil K., L. Hong, S. Pankanti, R. Bolle. "An Identity-A

    Read more →
  • How to Choose an AI Content Generator

    How to Choose an AI Content Generator

    Curious about the best AI content generator? An AI content generator is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI content generator slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Imieliński–Lipski algebra

    Imieliński–Lipski algebra

    In database theory, Imieliński–Lipski algebra is an extension of relational algebra onto tables with different types of null values. It is used to operate on relations with incomplete information. Imieliński–Lipski algebras are defined to satisfy precise conditions for semantically meaningful extension of the usual relational operators, such as projection, selection, union, and join, from operators on relations to operators on relations with various kinds of "null values". These conditions require that the system be safe in the sense that no incorrect conclusion is derivable by using a specified subset F of the relational operators; and that it be complete in the sense that all valid conclusions expressible by relational expressions using operators in F are in fact derivable in this system. For example, it is well known that the three-valued logic approach to deal with null values, supported treatment of nulls values by SQL is not complete, see Ullman book. To show this, let T be: Take SQL query Q SQL query Q will return empty set (no results) under 3-valued semantics currently adopted by all variants of SQL. This is the case because in SQL, NULL is never equal to any constant – in this case, neither to “Spring” nor “Fall” nor “Winter” (if there is Winter semester in this school). NULL='Spring' will evaluate to MAYBE and so will NULL='Fall'. The disjunction MAYBE OR MAYBE evaluates to MAYBE (not TRUE). Thus Igor will not be part of the answer (and of course neither will Rohit). But Igor should be returned as the answer. Indeed, regardless what semester Igor took the Networks class (no matter what was the unknown value of NULL), the selection condition will be true. This “Igor” will be missed by SQL and the SQL answer would be incomplete according to completeness requirements specified in Tomasz Imieliński, Witold Lipski, 'Incomplete Information in Relational Databases'. It is also argued there that 3-valued logic (TRUE, FALSE, MAYBE) can never provide guarantee of complete answer for tables with incomplete information. Three algebras which satisfy conditions of safety and completeness are defined as Imielinski–Lipski algebras: the Codd-Tables algebra, the V-tables algebra and the Conditional tables (C-tables) algebra. == Codd-tables algebra == Codd-tables algebra is based on the usual Codd's single NULL values. The table T above is an example of Codd-table. Codd-table algebra supports projection and positive selections only. It is also demonstrated in [IL84 that it is not possible to correctly extend more relational operators over Codd-Tables. For example, such basic operation as join is not extendable over Codd-tables. It is not possible to define selections with Boolean conditions involving negation and preserve completeness. For example, queries like the above query Q cannot be supported. In order to be able to extend more relational operators, more expressive form of null value representation is needed in tables which are called V-table. == V-tables algebra == V-tables algebra is based on many different ("marked") null values or variables allowed to appear in a table. V-tables allow to show that a value may be unknown but the same for different tuples. For example, in the table below Gaurav and Igor order the same (but unknown) beer in two unknown bars (which may, or may not be different – but remain unknown). Gaurav and Jane frequent the same unknown bar (Y1). Thus, instead one NULL value, we use indexed variables, or Skolem constants . V-tables algebra is shown to correctly support projection, positive selection (with no negation occurring in the selection condition), union, and renaming of attributes, which allows for processing arbitrary conjunctive queries. A very desirable property enjoyed by the V-table algebra is that all relational operators on tables are performed in exactly the same way as in the case of the usual relations. === Conditional tables (c-tables) algebra === Example of conditional table (c-table) is shown below. It has additional column “con” which is a Boolean condition involving variables, null values – same as in V-tables. over the following table c-table Conditional tables algebra, mainly of theoretical interest, supports projection, selection, union, join, and renaming. Under closed-world assumption, it can also handle the operator of difference, thus it can support all relational operators. == History == Imieliński–Lipski algebras were introduced by Tomasz Imieliński and Witold Lipski Jr. in Incomplete Information in Relational Databases.

    Read more →
  • Is an AI Essay Writer Worth It in 2026?

    Is an AI Essay Writer Worth It in 2026?

    Comparing the best AI essay writer? An AI essay writer is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI essay writer slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Is an AI Logo Maker Worth It in 2026?

    Is an AI Logo Maker Worth It in 2026?

    Looking for the best AI logo maker? An AI logo maker is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI logo maker slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Geoffrey J. Gordon

    Geoffrey J. Gordon

    Geoffrey J. Gordon is a professor at the Machine Learning Department at Carnegie Mellon University in Pittsburgh and director of research at the Microsoft Montréal lab. He is known for his research in statistical relational learning (a subdiscipline of artificial intelligence and machine learning) and on anytime dynamic variants of the A search algorithm. His research interests include multi-agent planning, reinforcement learning, decision-theoretic planning, statistical models of difficult data (e.g. maps, video, text), computational learning theory, and game theory. Gordon received a B.A. in computer science from Cornell University in 1991, and a PhD at Carnegie Mellon in 1999.

    Read more →
  • EPUAP

    EPUAP

    ePUAP (Electronic Platform of Public Administration Services) is a Polish nationwide platform for communication of citizens with public administrations in a uniform and standardized way. Built as part of the ePUAP-WKP project (State Informatization Plan). Service providers are public administration units and public institutions (especially entities that perform tasks commissioned by the state). The platform provides service providers with technological infrastructure to provide services to citizens (recipients). Among the participants of ePUAP there are both central administration units and local governments, including municipal offices. Among the services offered by ePUAP is also Profil Zaufany (Trusted Profile), which enables electronic filing with legal effect without the need to use a qualified signature and SAML-based single sign-on mechanism, which enables the same ePUAP account to log on to websites of various service providers. The website www.epuap.gov.pl enables defining citizen and businesses service processes, creates channels of access to different systems of public administration and extends the package of public services provided electronically. Services available through the ePUAP platform may be accessed at the official website. Currently all administration services are available in Polish only. == Overview == It is described by the Polish government as "a coherent and systematic action program designed and developed to allow public institutions make their electronic services available to the public". The platform provides citizens, businesses and institutions with a number of services intended to ensure smooth and safe communication between: customer to administrations (C2A), business to administration (B2A), administration to administration (A2A). === Main goals === The main project objectives are to create a single, secure and electronic access channel to public services for citizens, businesses and public administration and also to reduce time and lower the costs of sharing information resources and functionalities of administration domain systems. Within the project, the following functionalities and services were delivered: Public services catalogue – a method of presenting and describing administration services, ePUAP platform – a web platform designed to provide public services on the Internet, Interoperability portal – a portal for experts working on recommendations for electronic documents and forms used within Polish administration systems to assure the uniformity of IT standards, Central Repository of Electronic Document Models – a database for valid document models and electronic forms. == History and background == The ePUAP project was carried out in the years 2005–2008. Currently, a continuation project ePUAP2 is being carried out with the following objectives: to increase the number of online services available to the public including the registry services, to widen the scale of usage of public electronic services, to integrate subsequent systems of public administration and business on ePUAP portal, to define new processes of customer and business services. === ePUAP2 === ePUAP2 is a public and administrative project that extends the set of functional services developed during the first edition of the project and is another step in the process of transforming Poland into a modern and citizen-friendly country. The implementation period for the project covers the years 2009–2013. Project financing The cost of the project “Construction of electronic Platform of Public Administration Services” – 32 million PLN was covered in 75% by the funds from the European Regional Development Fund (under the Sector Operational Programme "Supporting Competitiveness of Enterprises for the years 2004–2006"), while the remaining 25% of the cost was covered by a Polish national co-financing. Funds for the ePUAP2 project were gained from the 7th priority axis of the Innovative Economy Operational Programme and amounts to 140 million PLN (85% of eligible expenses were covered by the European Regional Development Fund, 15% were covered by a national co-financing). The trustee of ePUAP is the Polish Ministry of the Interior and Administration. == Legal regulations == According to the Polish law from 1 May 2008, public authorities are required to accept documents in electronic form (bringing applications and proposals and other activities in electronic form). ePUAP enables public institutions to meet this requirement by providing a service infrastructure to set up am electronic inbox. The ePUAP inbox meets legal requirements, in particular: issuing an official confirmation of receipt in accordance with the regulation of the Prime Minister of 29 September 2005 on the organizational and technical conditions for the delivery of electronic documents to public entities; cooperation with hardware security modules (HSM), meeting the technical requirements set out in the law; handling documents electronically in accordance with the minimum requirements set out in the Regulation of the Polish Council of Ministers of 11 October 2005 on minimum requirements for ICT systems. == Incidents == === Crashes === The ePUAP system very often happens smaller or larger failures. Because it is used to sign the application profiles trusted also in other electronic systems such as public administration. Electronic Services Platform created by ZUS, the system fault ePUAP it very difficult to settle official matters most electronically. === "Infoafera" === According to TVN and the release of TVP News from 10 April 2014, the creation of ePUAP is also associated with the so-called "Infoafera." On 10 April 2014, the Minister of Internal Affairs of Poland confirmed the information that the American technology company HP confessed to its participation in the Polish info-tour and corruption of Polish officials. By March 2014, the construction of ePUAP and its maintenance cost PLN 98.4 million. PLN 67.8 million has been used for this project. Challenged expenses only on the portal itself is approx. PLN 20 million.

    Read more →
  • Ancient text corpora

    Ancient text corpora

    Ancient text corpora are the entire collection of texts from the period of ancient history, defined in this article as the period from the beginning of writing up to 300 AD. These corpora are important for the study of literature, history, linguistics, and other fields, and are a fundamental component of the world's cultural heritage. Chinese, Latin, and Greek are examples of ancient languages with significant text corpora, although much of these corpora are known to us via transmission (frequently via medieval manuscript copies) rather than in their original form. These texts – both transmitted and original – provide valuable insights into the history and culture of different regions of the world, and have been studied for centuries by scholars and researchers. Other ancient texts – particularly stone inscriptions and papyrus scrolls – have been published following archaeological research, notably the cuneiform corpus of c.10 million words and the c.5 million words in ancient Egyptian. Through advances in technology and digitization, ancient text corpora are more accessible than ever before. Tools such as the Perseus Digital Library and the Digital Corpus of Sanskrit have made it easier for researchers to access and analyze these texts. == Quantifying the corpora == Two types of ancient texts are known to modern scholars – those that have only survived in younger manuscripts, but whose great age is undisputed (this applies to the bulk of the Chinese, Brahmi, Greek, Latin, Hebrew and Avestan tradition), and those known from original inscriptions, papyri and other manuscripts. Counting of the words in each corpus presents significant methodological challenges – in principle, every single occurrence of a word in the text is counted separately, but in the case of parallel transmission of literary texts, only a single transmission is taken into account. Just as the Book of the Dead and the coffin texts are only included once in the number given for the Egyptian, the Greek and Latin literary works should only be counted according to one manuscript. If, on the other hand, tombs, royal inscriptions or economic documents of certain ancient languages often show a more or less identical form, this is not evaluated as a purely "parallel tradition". Attached prepositions are counted as separate words, except in the case of the definite article in Hebrew, Aramaic and Greek since it has no equivalent in most languages, so its frequency would significantly affect the comparability of numbers. === Languages with known size estimates === === South Asian === Sanskrit (Vedic Sanskrit and Classical Sanskrit) Indus script (3,800 items, c.20,000 characters) Brahmi script Old Tamil Early Indian epigraphy and Indian epic poetry Kharosthi Pali literature List of historic Indian texts === Mesoamerican === Olmec hieroglyphs Maya script === East Asian === Old Chinese Chinese classics The pre-Qin corpus: a collection of ancient Chinese texts written before the Qin dynasty (221 BCE). The corpus includes texts from Confucianism, Taoism, Legalism, and other schools of thought. The pre-Han corpus: a collection of ancient Chinese texts written before the Han dynasty (202 BCE). The corpus includes texts from Confucianism, Taoism, Legalism, and other schools of thought. See the Chinese Text Project Chinese bronze inscriptions, Oracle bone script, Seal script, Clerical script === Central Iranian languages === Prior to 300 AD, the Central Iranian languages are mainly in the form of Sassanid stone inscriptions in the two closely related idioms Middle Persian (Pahlavi scripts and Inscriptional Parthian), there are 5000 for the corpus of Middle Persian (mostly 3rd, but also 4th/5th centuries) and for the corpus of Parthian (3rd century) 3000 words. To what extent some of the Manichaean Middle Persian literary texts may date back to the 3rd century is difficult to estimate; Mani is said to have personally written the Shabuhragan totaling about 5000 words. In any case, if we combine Middle Persian and Parthian, we come to over 10,000 words. === Proto-Sinaitic === Proto-Sinaitic script has no more than about 400 letters (number of words is unknown since the script has not been fully interpreted). To a similar extent, there are probably approximately contemporaneous Proto-Canaanite inscriptions (ibid.). === Anatolian === Luwian cuneiform, approx. 3000 words the Palaic language few hundred words. Hieroglyphic Luwian the Lycian alphabet (the best attested Anatolian successor language written in alphabetic script) with about 5000 words The Lydian alphabet 109 inscriptions comprising about 1500 words The Phrygian alphabet the in-tomb inscriptions from the 2nd and 3rd centuries AD (approx. 1000 words) and in the so-called "old Phrygian" inscriptions less than 300 words The Carian alphabets whose texts, mainly from Egypt, contain around 600 words. === Old Italic === the Umbrian language attested essentially by the sacrificial instructions of the Iguvinian Tables with 5000 words the Oscan language (ibid.) with 2000 words the Messapic language with probably a good 1000 words (the estimate is difficult because most texts in this hardly understandable language do not use word separators) the Venetic language a few hundred words the Faliscan language a few hundred words Cisalpine Celtic inscriptions amount to approximately 2000 words, to which are added a number of glosses by classical authors === Iberia === Iberian scripts, more rarely written in Greek or Latin script, approx. 2500 words Celtiberian script, which refers to Celtic language testimonies in Iberian, but also in Latin script from Spain (approx. 1000 words) Southwest Paleohispanic script, 78 inscriptions, a few hundred words Lusitanian language, three monuments in Latin script, approx. 60 words === Germanic Northern Europe === Runic inscriptions dated before the 4th century amount to about 30 pieces, which contain no more than 50 words in total === Africa === Geʽez script: comparatively few inscriptions with a total of around 1,000 words before 300 AD. Following Christianization in the 4th century, more extensive texts are known. Libyco-Berber alphabet: over 1,000 inscriptions from the Maghreb, which are dated to Roman times. Most texts do not use a word separator; Peust estimates that the total number of words could be around 5,000 Meroitic script (Ancient Nubian): about 900 texts are known, which Peust estimates may contain approximately 10,000 words, albeit with uncertainty from the fact that the word separator is not used consistently in the Meroitic script. === Aegean === The Cretan Linear A inscriptions that have not yet been deciphered are available in about 2500 texts, which contain a total of around 20,000 characters. The total number of words can hardly be determined; Peust tentatively put it in the same order of magnitude as in Meroitic. In addition to the Linear A texts, there are also inscriptions Cretan hieroglyphs of a few hundred characters and texts written in the Greek alphabet, but not in Greek, with a few dozen words Cypriot syllabary in the first millennium BC, in which mostly Greek texts were recorded. The relevant texts comprise around 100 to 200 words. === Micro corpora === There are a significant number of ancient micro-corpus languages. Estimating the total number of attested ancient languages may be as difficult as estimating their corpus size. For example, Greek and Latin sources hand down an enormous amount of foreign-language glosses, the seriousness of which is not always certain. == Preservation and curation == Historic preservation and maintaining ancient text corpora presents several challenges, including issues with preservation, translation, and digitization. Many ancient texts have been lost over time, and those that survive may be damaged or fragmented. Translating ancient languages and scripts requires specialized expertise, and digitizing texts can be time-consuming and resource-intensive. == Corpus linguistics == The field of corpus linguistics studies language as expressed in text corpora. This includes the analysis of word frequency, collocations, grammar, and semantics. Ancient text corpora provide a valuable resource for corpus linguistics research, enabling scholars to explore the evolution of language and culture over time.

    Read more →
  • The Best Free AI Photo Editor for Beginners

    The Best Free AI Photo Editor for Beginners

    Comparing the best AI photo editor? An AI photo editor is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI photo editor slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →