AI Data Visualization Tools

AI Data Visualization Tools — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Artificial intelligence in education

    Artificial intelligence in education

    Artificial intelligence in education (often abbreviated as AIEd) is a subfield of educational technology that studies how to use artificial intelligence to create learning environments. Considerations in the field include data-driven decision-making, AI ethics, data privacy and AI literacy. Concerns include the potential for cheating, over-reliance, equity of access, reduced critical thinking, and the perpetuation of misinformation and bias. == History == Efforts to integrate AI into educational contexts have often followed technological advancement in the history of artificial intelligence. In the 1960s, educators and researchers began developing computer-based instruction systems, such as PLATO, developed by the University of Illinois. In the 1970s and 1980s, intelligent tutoring systems (ITS) were being adapted for classroom instruction. The International Artificial Intelligence in Education Society was founded in 1993. Coinciding with the AI boom of the 2020s, the use of large language models in the global north has been promoted and funded by venture capital and big tech. Companies creating AI services have targeted students and educational institutions as customers. Similarly, pre-AI boom educational companies have expanded their use of AI technologies. These commercial incentives for AIEd use may be related to a potential AI bubble. In the U.S., bipartisan support of AI development in K-12 education has been expressed, but specific implementations and best practices remain contentious. == Theory == AIEd applies theory from education studies, machine learning, and related fields. A 2019 review of the previous decade of studies found that most research prioritized technological design over pedagogical integration. Ouyang and Jiao (2021) propose three paradigms for AI in education, which follow roughly from least to most learner-centered and from requiring least to most technical complexity from the AI systems: AI-directed, learner-as-recipient: AIEd systems present a pre-set curriculum based on statistical patterns that do not adjust to learner's feedback. AI-supported, learner-as-collaborator: Systems that incorporate responsiveness to learner's feedback through, for example, natural language processing, wherein AI can support knowledge construction. AI-empowered, learner-as-leader: This model seeks to position AI as a supplement to human intelligence wherein learners take agency and AI provides consistent and actionable feedback. Some scholars place AI in education within a socio-technical framework. This positions AI alongside other emerging educational technologies, such as computing, the internet, and social media. The framework of Tsao, Heinrichs and Camit (2025) draws on new materialism and posthumanism, specifically Donna Haraway's concept of sympoiesis (making-with). This perspective views learning as an entanglement of human and non-human actors (students, teachers, and AI algorithms), where knowledge is co-composed in contact zones between human context and algorithmic prediction. AI agents have been trained on biased datasets, and thus continue to perpetuate societal biases. Since LLMs were created to produce human-like text, algorithmic bias can be introduced and reproduced. AI's data processing and monitoring reinforce neoliberal approaches to education rather than addressing inequalities. == Applications == Uses of generative AI chatbots in education have included assessment and feedback, machine translations, proof-reading exam question generation and copy editing, or as virtual assistants. Emotional AI in education is the study and development of systems that can detect learners' emotions or provide emotional support in learning. == Usage == === Schools and educators === Following the release of ChatGPT in November 2022, some schools and large school districts blocked access to the site and issued warnings that the use of such tools would be seen as cheating. Governmental and non-governmental organizations such as UNESCO, Article 4 of the European Union's AI Act, and the U.S. Department of Education have published reports advocating for specific AIEd approaches. National higher-education bodies have also published guidance on generative AI, including Ireland's Higher Education Authority, which issued a policy framework for higher education teaching and learning in December 2025. In 2024, UNESCO released updated global guidance for generative AI in education, emphasizing ethical use, teacher training, and data protection to ensure responsible integration of AI tools in learning environments. According to Taso (2025), policy implementation in higher education is interpreted and enacted differently by various organizations. These decentralized policies can lead to inconsistent enforcement and confusion among students regarding what constitutes acceptable use, with the burden of ethical navigation falling on individual teachers and students. AI integration in classrooms has created new forms of invisible labour for educators, who must navigate ambiguous policies, redesign assessments to be AI-resilient, and adjudicate potential academic integrity violations. The use of AI detection tools has also been criticised for creating an adversarial relationship between students and institutions, where students may be falsely accused of misconduct based on probabilistic software. AIEd advocates say that efforts should be made towards increasing global accessibility and training educators to serve underprivileged areas. === Students === Reliance on generative AI has been linked with reduced academic self-esteem and performance, and heightened learned helplessness. Algorithm errors and hallucinations are common flaws in AI agents, making them less trustworthy and reliable. According to a 2025 survey from Inside Higher Ed, 85% of higher education students use generative AI technology in some way, with 25% using AI to complete assignments for them. The most common reason cited for using AI to cheat was pressure to get high grades. 97% of students wanted some form of action from schools on the threat to academic integrity caused by AI, with the most popular options being clearer policies and more education about ethical uses of AI. In September 2025, The Atlantic published an op-ed from a high school senior arguing that the normalization of AI cheating was eroding critical thinking, academic integrity, creativity, and the shared student experience.

    Read more →
  • Birkhoff algorithm

    Birkhoff algorithm

    Birkhoff's algorithm (also called Birkhoff-von-Neumann algorithm) is an algorithm for decomposing a bistochastic matrix into a convex combination of permutation matrices. It was published by Garrett Birkhoff in 1946. It has many applications. One such application is for the problem of fair random assignment: given a randomized allocation of items, Birkhoff's algorithm can decompose it into a lottery on deterministic allocations. == Terminology == A bistochastic matrix (also called: doubly-stochastic) is a matrix in which all elements are greater than or equal to 0 and the sum of the elements in each row and column equals 1. An example is the following 3-by-3 matrix: ( 0.2 0.3 0.5 0.6 0.2 0.2 0.2 0.5 0.3 ) {\displaystyle {\begin{pmatrix}0.2&0.3&0.5\\0.6&0.2&0.2\\0.2&0.5&0.3\end{pmatrix}}} A permutation matrix is a special case of a bistochastic matrix, in which each element is either 0 or 1 (so there is exactly one "1" in each row and each column). An example is the following 3-by-3 matrix: ( 0 1 0 0 0 1 1 0 0 ) {\displaystyle {\begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix}}} A Birkhoff decomposition (also called: Birkhoff-von-Neumann decomposition) of a bistochastic matrix is a presentation of it as a sum of permutation matrices with non-negative weights. For example, the above matrix can be presented as the following sum: 0.2 ( 0 1 0 0 0 1 1 0 0 ) + 0.2 ( 1 0 0 0 1 0 0 0 1 ) + 0.1 ( 0 1 0 1 0 0 0 0 1 ) + 0.5 ( 0 0 1 1 0 0 0 1 0 ) {\displaystyle 0.2{\begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix}}+0.2{\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}}+0.1{\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}}+0.5{\begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix}}} Birkhoff's algorithm receives as input a bistochastic matrix and returns as output a Birkhoff decomposition. == Tools == A permutation set of an n-by-n matrix X is a set of n entries of X containing exactly one entry from each row and from each column. A theorem by Dénes Kőnig says that: Every bistochastic matrix has a permutation-set in which all entries are positive.The positivity graph of an n-by-n matrix X is a bipartite graph with 2n vertices, in which the vertices on one side are n rows and the vertices on the other side are the n columns, and there is an edge between a row and a column if the entry at that row and column is positive. A permutation set with positive entries is equivalent to a perfect matching in the positivity graph. A perfect matching in a bipartite graph can be found in polynomial time, e.g. using any algorithm for maximum cardinality matching. Kőnig's theorem is equivalent to the following:The positivity graph of any bistochastic matrix admits a perfect matching.A matrix is called scaled-bistochastic if all elements are non-negative, and the sum of each row and column equals c, where c is some positive constant. In other words, it is c times a bistochastic matrix. Since the positivity graph is not affected by scaling:The positivity graph of any scaled-bistochastic matrix admits a perfect matching. == Algorithm == Birkhoff's algorithm is a greedy algorithm: it greedily finds perfect matchings and removes them from the fractional matching. It works as follows. Let i = 1. Construct the positivity graph GX of X. Find a perfect matching in GX, corresponding to a positive permutation set in X. Let z[i] > 0 be the smallest entry in the permutation set. Let P[i] be a permutation matrix with 1 in the positive permutation set. Let X := X − z[i] P[i]. If X contains nonzero elements, Let i = i + 1 and go back to step 2. Otherwise, return the sum: z[1] P[1] + ... + z[2] P[2] + ... + z[i] P[i]. The algorithm is correct because, after step 6, the sum in each row and each column drops by z[i]. Therefore, the matrix X remains scaled-bistochastic. Therefore, in step 3, a perfect matching always exists. == Run-time complexity == By the selection of z[i] in step 4, in each iteration at least one element of X becomes 0. Therefore, the algorithm must end after at most n2 steps. However, the last step must simultaneously make n elements 0, so the algorithm ends after at most n2 − n + 1 steps, which implies O ( n 2 ) {\displaystyle O(n^{2})} . In 1960, Joshnson, Dulmage and Mendelsohn showed that Birkhoff's algorithm actually ends after at most n2 − 2n + 2 steps, which is tight in general (that is, in some cases n2 − 2n + 2 permutation matrices may be required). == Application in fair division == In the fair random assignment problem, there are n objects and n people with different preferences over the objects. It is required to give an object to each person. To attain fairness, the allocation is randomized: for each (person, object) pair, a probability is calculated, such that the sum of probabilities for each person and for each object is 1. The probabilistic-serial procedure can compute the probabilities such that each agent, looking at the matrix of probabilities, prefers his row of probabilities over the rows of all other people (this property is called envy-freeness). This raises the question of how to implement this randomized allocation in practice? One cannot just randomize for each object separately, since this may result in allocations in which some people get many objects while other people get no objects. Here, Birkhoff's algorithm is useful. The matrix of probabilities, calculated by the probabilistic-serial algorithm, is bistochastic. Birkhoff's algorithm can decompose it into a convex combination of permutation matrices. Each permutation matrix represents a deterministic assignment, in which every agent receives exactly one object. The coefficient of each such matrix is interpreted as a probability; based on the calculated probabilities, it is possible to pick one assignment at random and implement it. == Extensions == The problem of computing the Birkhoff decomposition with the minimum number of terms has been shown to be NP-hard, but some heuristics for computing it are known. This theorem can be extended for the general stochastic matrix with deterministic transition matrices. Budish, Che, Kojima and Milgrom generalize Birkhoff's algorithm to non-square matrices, with some constraints on the feasible assignments. They also present a decomposition algorithm that minimizes the variance in the expected values. Vazirani generalizes Birkhoff's algorithm to non-bipartite graphs. Valls et al. showed that it is possible to obtain an ϵ {\displaystyle \epsilon } -approximate decomposition with O ( log ⁡ ( 1 / ϵ 2 ) ) {\displaystyle O(\log(1/\epsilon ^{2}))} permutations.

    Read more →
  • Basic Formal Ontology

    Basic Formal Ontology

    Basic Formal Ontology (BFO) is a top-level ontology developed by Barry Smith and colleagues to promote interoperability among domain ontologies. The BFO methodology accomplishes this through a process of downward population. BFO is a formal ontology. The structure of BFO is based on a division of entities into two disjoint categories of continuant and occurrent, the former consists of objects and spatial regions, the latter contains processes conceived as extended through (or spanning) time. BFO thereby seeks to consolidate both time and space within a single framework A guide to building BFO-conformant domain ontologies was published by MIT Press in 2015. In 2021, the standard ISO/IEC 21838-2:2021 Information Technology — Top-level Ontologies (TLO) — Part 2: Basic Formal Ontology (BFO) was published by the Joint Technical Committee of the International Standards Organization and the International Electrotechnical Commission. ISO/IEC 21838 is a multi-part standard. Part 1 of the standard specifies the requirements that must be met if an ontology is to be classified as a top-level ontology by the standard. == History == BFO arose against the background of research in ontologies in the domain of geospatial information science by David Mark, Pierre Grenon, Achille Varzi and others, with a special role for the study of vagueness and of the ways sharp boundaries in the geospatial and other domains are created by fiat. BFO has passed through four major releases. 2001: release of BFO 1 2007: release of BFO 1.1 2015: release of BFO 2.0 2020: release of BFO 2020 2021: release of BFO 2020 as an ISO/IEC Standard The current revision was released in 2020, and this forms the basis of the standard ISO/IEC 21838-2, which was released by the Joint Committee of the International Standards Organization and International Electrotechnical Commission in 2021. == Applications == BFO has been adopted as a foundational ontology by over 650 ontology projects, principally in the areas of biomedical ontology, security and defense (intelligence) ontology, and industry ontologies. Example applications of BFO can be seen in the Ontology for Biomedical Investigations (OBI). In January 2024, BFO and the Common Core Ontologies (CCO), a suite of BFO-extension ontologies, were adopted as the "baseline standards for formal DOD and IC ontology" development work in the DOD and Intelligence Community. A memorandum to this effect was signed by the chief data officers of the DOD, the Office of the Director of National Intelligence and the Chief Digital and Artificial Intelligence Office.

    Read more →
  • EJB QL

    EJB QL

    EJB QL or EJB-QL is a portable database query language for Enterprise Java Beans. It was used in Java EE applications. Compared to SQL, however, it is less complex but less powerful as well. == History == The language has been inspired, especially EJB3-QL, by the native Hibernate Query Language. In EJB3 It has been mostly replaced by the Java Persistence Query Language. == Differences == EJB QL is a database query language similar to SQL. The used queries are somewhat different from relational SQL, as it uses a so-called "abstract schema" of the enterprise beans instead of the relational model. In other words, EJB QL queries do not use tables and their components, but enterprise beans, their persistent state, and their relationships. The result of an SQL query is a set of rows with a fixed number of columns. The result of an EJB QL query is either a single object, a collection of entity objects of a given type, or a collection of values retrieved from CMP fields. One has to understand the data model of enterprise beans in order to write effective queries.

    Read more →
  • Inbox by Gmail

    Inbox by Gmail

    Inbox by Gmail was an email service developed by Google. Announced on a limited invitation-only basis on October 22, 2014, it was officially released to the public on May 28, 2015. Inbox was shut down by Google on April 2, 2019. Available on the web, and through mobile apps for Android and iOS, Inbox by Gmail aimed to improve email productivity and organization through several key features. Bundles gathered emails on the same topic together; highlighted surface key details from messages, reminders and assists; and a "snooze" functionality enabled users to control when specific information would appear. Updates to the service enabled an "undo send" feature; a "Smart Reply" feature that automatically generated short reply examples for certain emails; integration with Google Calendar for event organization, previews of newsletters; and a "Save to Inbox" feature that let users save links for later use. Inbox by Gmail received generally positive reviews. At its launch, it was called "minimalist and lovely, full of layers and easy to navigate", with features deemed helpful in finding the right messages—one reviewer noted that the service felt "a lot like the future of email". However, it also received criticism, particularly for a low density of information, algorithms that needed tweaking, and because the service required users to "give up the control" of organizing their own email, meaning that "Anyone who already has a system for organizing their emails will likely find themselves fighting Google's system". Google noted in March 2016 that 10% of all replies on mobile originated from Inbox's Smart Reply feature. Google announced it would discontinue Inbox by Gmail in March 2019, with many of its features integrated into Gmail proper. == Features == Inbox by Gmail scanned the user's incoming Gmail messages for information. It gathered email messages related to the same overall topic into an organized bundle, with a title describing the bundle's content. For example, flight tickets, car rentals, and hotel reservations were grouped under "Travel", giving the user an easier overview of emails. Users could also group emails together manually, to "teach" the Inbox how the user worked. The service highlighted key details and important information in messages, such as flight itineraries, event information, photos and documents. Inbox could retrieve updated information from the Internet, including the real-time status of flights and package deliveries. Users could set reminders to bring up important messages later. When a user needed particular information, Inbox could assist the user by displaying the necessary details. Where Inbox highlights information was not needed immediately, users could "snooze" a message or reminder, with options to make the information reappear at a later time or specific location. In June 2015, Google added an "Undo Send" feature to Inbox, giving the user 10 seconds to undo sending a message. In November 2015, Google added "Smart Reply" functionality to the mobile apps. With Smart Reply, Inbox determined which emails could be answered with a short reply, generating three example responses from which the user could select one with a single tap. Smart Reply (initially available only on the Android and iOS mobile apps) was added to the Inbox website in March 2016, Google announcing that "10% of all your replies on mobile already use Smart Reply". By May 2017, Google said Smart Reply was driving about 12% of replies in inbox on mobile. In April 2016, Google updated Inbox with three new features; Google Calendar event organization, newsletter previews, and a "Save to Inbox" functionality that let the user save links for later use, rather than having to email links to themselves. In December 2017, Google introduced an "Unsubscribe" card that let users easily unsubscribe from mailing lists. The card appeared for email messages (from specific senders) that the user had not opened for a month. A few popular Inbox by Gmail features were subsequently added to Gmail: "Snoozing" of emails Nudges: Gmail could move old messages back to the top of the inbox when it thought a follow up or reply might be required. Hover actions: Placing the mouse cursor over a certain part of the message could quickly effect an action, such as archiving, without its being opened. Smart reply: This feature employed boilerplate text to suggest appropriate replies. Google reportedly wished, at a time then to be decided, to add the "bundles" feature to Gmail, which at the time was available only in Inbox for Gmail. By March 2020, many Inbox features were still missing from Gmail. == Platforms == Inbox by Gmail was announced on a limited invitation-only basis on October 22, 2014, available on the web, and through the Android and iOS mobile operating systems. It was officially released to the public on May 28, 2015. == Reception == David Pierce of The Verge praised the service, writing that it was "minimalist and lovely, full of layers and easy to navigate. It's remarkably fast and smooth on all platforms, and far better on iOS than the Gmail app". However, he criticized the app's low density of information, with only a few emails visible on the screen at a time, making it "a bit of a challenge" for users who need to go through "hundreds of emails" every day. Although positive that "Inbox feels a lot like the future of email", Pierce wrote that there was "plenty of algorithm tweaking and design condensing to do", with particular attention needed on a "compact view" for denser view of information on the screen. Sarah Mitroff of CNET also praised Inbox, writing, "Not only is it visually appealing, it's also full of features that help you find every message you need, when you need it". She added that users must "give up the control" to organize their email, and that it "won't vibe with everyone", but admitted that "if you're willing ... the app will reward you with a smarter and cleaner inbox." Mitroff noted that, initially, users had to coach the app about which bundle was appropriate for certain emails, writing, "It's a tedious process at first, by [sic] in just a few days Inbox starts to get it right." Regarding any downsides of the service, Mitroff wrote that "Inbox has a built-in strategy for managing your emails that works best on its own. Anyone who already has a system for organizing their emails will likely find themselves fighting Google's system". == Discontinuation and legacy == Google ended the service in March 2019. Google called Inbox "a great place to experiment with new ideas" and noted that many of those ideas had been migrated to Gmail. The company wanted, going forward, to focus its resources on a single email system. Several services, like Shortwave, attempted to resurrect some of the features of Inbox by Gmail to attract its old users. Similarly, Inbox Reborn, an actively maintained browser extension developed by a team of volunteer developers from around the world since 2018, aims to recreate the core features and visual style of Inbox by Gmail within the standard Gmail interface. The project continues to focus on preserving functionalities such as email bundling and streamlined workflows to provide users with a familiar productivity experience. Afterwards, most people moved to Spark, Spike, or Newton. According to a product manager at Google, a "more focused approach" regarding email was the companies goal. This is likely the reason they moved away from Inbox.

    Read more →
  • Shapiro–Senapathy algorithm

    Shapiro–Senapathy algorithm

    The Shapiro—Senapathy algorithm (S&S) is a computational method for identifying splice sites in eukaryotic genes. The algorithm employs a Position Weight Matrix (PWM) scoring formula to predict donor and acceptor splice sites in any given gene. This methodology has been used to discover splice sites and disease-causing splice site mutations in the human genome, and has become a standard tool in clinical genomics. The S&S algorithm has been cited in thousands of clinical studies, according to Google Scholar. It has also formed the basis of widely used software, including Human Splicing Finder, SROOGLE, and Alamut, which identify splice sites and splice site mutations that cause disease. The algorithm has uncovered splicing mutations in diseases ranging from cancers to inherited disorders, and predicted the deleterious effects of these mutations including exon skipping, intron retention, and cryptic splice site activation. == The algorithm == A splice site defines the boundary between a coding exon and a non-coding intron in eukaryotic genes. The S&S algorithm employs a sliding window, corresponding to the length of the splice site motif, to scan a gene sequence and detect potential splice sites. For each sliding window, the algorithm calculates a score by comparing the nucleotide sequence to a Position Weight Matrix (PWM) derived from known splice sites. This formula generates a percentile score, indicating the likelihood that a given sequence functions as a donor or acceptor splice site. The majority of disease-causing mutations in the human genome are located in splice sites. Clinical genomics studies analyze the splice site scores generated by the S&S algorithm to predict the consequences of splice site mutations including exon skipping and intron retention. The algorithm's sensitivity to single-nucleotide changes allows it to determine mutations that may impact RNA splicing and contribute to disease. In addition to identifying real splice sites, the S&S algorithm has been used to discover cryptic splice sites — alternative splice sites activated by mutations — which may disrupt normal splicing. The algorithm detects mutations that lead to the activation of cryptic splice sites, which may be located proximal to real splice sites or deep within non-coding introns. It has thus been used to determine the causes of numerous diseases that are due to cryptic splicing. == Cancer gene discovery using S&S == The S&S algorithm has been used to identify splice-site mutations in genes associated with several cancers. For example, genes causing commonly occurring cancers including breast cancer, ovarian cancer, colorectal cancer, leukemia, head and neck cancers, prostate cancer, retinoblastoma, squamous cell carcinoma, gastrointestinal cancer, melanoma, liver cancer, Lynch syndrome, skin cancer, and neurofibromatosis have been found. In addition, splicing mutations in genes causing less commonly known cancers including gastric cancer, gangliogliomas, Li-Fraumeni syndrome, Loeys–Dietz syndrome, Osteochondromas (bone tumor), Nevoid basal cell carcinoma syndrome, and Pheochromocytomas have been identified. Specific mutations in different splice sites in various genes causing breast cancer (e.g., BRCA1, PALB2), ovarian cancer (e.g., SLC9A3R1, COL7A1, HSD17B7), colon cancer (e.g., APC, MLH1, DPYD), colorectal cancer (e.g., COL3A1, APC, HLA-A), skin cancer (e.g., COL17A1, XPA, POLH), and Fanconi anemia (e.g., FANC, FANA) have been uncovered. The mutations in the donor and acceptor splice sites in different genes causing a variety of cancers that have been identified by S&S are shown in Table 1. == Discovery of genes causing inherited disorders using S&S == Specific mutations in different splice sites in various genes that cause inherited disorders, including, for example, Type 1 diabetes (e.g., PTPN22, TCF1 (HCF-1A)), hypertension (e.g., LDL, LDLR, LPL), Marfan syndrome (e.g., FBN1, TGFBR2, FBN2), cardiac diseases (e.g., COL1A2, MYBPC3, ACTC1), eye disorders (e.g., EVC, VSX1) have been uncovered. A few example mutations in the donor and acceptor splice sites in different genes causing a variety of inherited disorders identified using S&S are shown in Table 2. == Genes causing immune system disorders == More than 100 immune system disorders affect humans, including inflammatory bowel diseases, multiple sclerosis, systemic lupus erythematosus, bloom syndrome, familial cold autoinflammatory syndrome, and dyskeratosis congenita. The Shapiro–Senapathy algorithm has been used to discover genes and mutations involved in many immune disorder diseases, including Ataxia telangiectasia, B-cell defects, epidermolysis bullosa, and X-linked agammaglobulinemia. Xeroderma pigmentosum, an autosomal recessive disorder is caused by faulty proteins formed due to new preferred splice donor site identified using S&S algorithm and resulted in defective nucleotide excision repair. Type I Bartter syndrome (BS) is caused by mutations in the gene SLC12A1. S&S algorithm helped in disclosing the presence of two novel heterozygous mutations c.724 + 4A > G in intron 5 and c.2095delG in intron 16 leading to complete exon 5 skipping. Mutations in the MYH gene, which is responsible for removing the oxidatively damaged DNA lesion are cancer-susceptible in the individuals. The IVS1+5C plays a causative role in the activation of a cryptic splice donor site and the alternative splicing in intron 1, S&S algorithm shows, guanine (G) at the position of IVS+5 is well conserved (at the frequency of 84%) among primates. This also supported the fact that the G/C SNP in the conserved splice junction of the MYH gene causes the alternative splicing of intron 1 of the β type transcript. Splice site scores were calculated according to S&S to find EBV infection in X-linked lymphoproliferative disease. Identification of Familial tumoral calcinosis (FTC) is an autosomal recessive disorder characterized by ectopic calcifications and elevated serum phosphate levels and it is because of aberrant splicing. == Application of S&S in hospitals for clinical practice and research == The Shapiro–Senapathy (S&S) algorithm has played a significant role in advancing the diagnosis and treatment of human diseases through its application in modern clinical genomics. With the widespread adoption of next-generation sequencing (NGS) technologies, the S&S algorithm is now routinely integrated into clinical practice by geneticists and diagnostic laboratories. It is implemented in various computational tools such as Human Splicing Finder (HSF), Splice Site Finder (SSF), and Alamut Visual, which assist in interpreting the functional impact of genetic variants on RNA splicing. The algorithm is particularly useful in identifying pathogenic splice site mutations in cases where the clinical presentation is unclear or where conventional diagnostic methods have failed to identify a causative gene. Its utility has been demonstrated across diverse patient cohorts, including individuals from different ethnic backgrounds with various cancers and inherited genetic disorders. The following are selected examples illustrating its application in clinical research. === Cancers === === Inherited disorders === == S&S - Algorithm for identifying splice sites, exons and split genes == The Shapiro–Senapathy algorithm (SSA) was developed to identify splice sites in uncharacterized genomic sequences, with early applications in the Human Genome Project. The method introduced a Position Weight Matrix (PWM)-based approach to analyze splicing sequences across eukaryotic organisms, marking the first computational framework to systematically define splice sites using probabilistic scoring. Key innovations of the algorithm included: Exon Detection – Exons were defined as sequences bounded by acceptor and donor splice sites with S&S scores above a threshold, requiring an open reading frame (ORF) for validation. Gene Prediction – The method enabled the identification of complete genes by assembling predicted exons, forming a basis for later gene-finding tools. Mutation Analysis – The algorithm distinguishes deleterious splice-site mutations (which disrupt protein function by lowering S&S scores) from neutral variations. This capability allowed researchers to study disease-linked cryptic splice sites in humans, animals, and plants. SSA's PWM-based framework influenced subsequent computational methods, including machine learning and neural network approaches, for splice-site prediction and alternative splicing research. It remains a foundational tool in genomics and disease studies. == Discovering the mechanisms of aberrant splicing in diseases == The Shapiro–Senapathy algorithm has been used to determine the various aberrant splicing mechanisms in genes due to deleterious mutations in the splice sites, which cause numerous diseases. Deleterious splice site mutations impair the normal splicing of the gene transcripts, and thereby make the encoded protei

    Read more →
  • EdgeRank

    EdgeRank

    EdgeRank is the name commonly given to the algorithm that Facebook uses to determine what articles should be displayed in a user's News Feed. As of 2011, Facebook has stopped using the EdgeRank system and uses a machine learning algorithm that, as of 2013, takes more than 100,000 factors into account. EdgeRank was developed and implemented by Serkan Piantino. == Formula and factors == In 2010, a simplified version of the EdgeRank algorithm was presented as: ∑ e d g e s e u e w e d e {\displaystyle \sum _{\mathrm {edges\,} e}u_{e}w_{e}d_{e}} where: u e {\displaystyle u_{e}} is user affinity. w e {\displaystyle w_{e}} is how the content is weighted. d e {\displaystyle d_{e}} is a time-based decay parameter. User Affinity: The User Affinity part of the algorithm in Facebook's EdgeRank looks at the relationship and proximity of the user and the content (post/status update). Content Weight: What action was taken by the user on the content. Time-Based Decay Parameter: New or old. Newer posts tend to hold a higher place than older posts. Some of the methods that Facebook uses to adjust the parameters are proprietary and not available to the public. A study has shown that it is possible to hypothesize a disadvantage of the "like" reaction and advantages of other interactions (e.g., the "haha" reaction or "comments") in content algorithmic ranking on Facebook. The "like" button can decrease the organic reach as a "brake effect of viral reach". The "haha" reaction, "comments" and the "love" reaction could achieve the highest increase in total organic reach. == Impact == EdgeRank and its successors have a broad impact on what users actually see out of what they ostensibly follow: for instance, the selection can produce a filter bubble (if users are exposed to updates which confirm their opinions etc.) or alter people's mood (if users are shown a disproportionate amount of positive or negative updates). As a result, for Facebook pages, the typical engagement rate is less than 1% (or less than 0.1% for the bigger ones), and organic reach 10% or less for most non-profits. As a consequence, for pages, it may be nearly impossible to reach any significant audience without paying to promote their content.

    Read more →
  • Long division

    Long division

    In arithmetic, long division is a standard division algorithm suitable for dividing multi-digit numbers that is simple enough to perform by hand. It breaks down a division problem into a series of easier steps. As in all division problems, one number, called the dividend, is divided by another, called the divisor, producing a result called the quotient. It enables computations involving arbitrarily large numbers to be performed by following a series of simple steps. The abbreviated form of long division is called short division, which is almost always used instead of long division when the divisor has only one digit. == History == Related algorithms have existed since the 12th century. Al-Samawal al-Maghribi (1125–1174) performed calculations with decimal numbers that essentially require long division, leading to infinite decimal results, but without formalizing the algorithm. Caldrini (1491) is the earliest printed example of long division, known as the Danda method in medieval Italy, and it became more practical with the introduction of decimal notation for fractions by Pitiscus (1608). The specific algorithm in modern use was introduced by Henry Briggs c. 1600. == Education == Inexpensive calculators and computers have become the most common tools for performing division in educational and professional contexts worldwide, reducing reliance on traditional paper-and-pencil techniques. Internally, these devices implement various division algorithms, many of which rely on iterative approximations and multiplication to improve computational efficiency. Educational approaches to teaching division vary across countries and regions, reflecting differing curricular priorities. In North America, long division has been de-emphasized or, in some cases, removed from portions of the curriculum as part of reform mathematics, which emphasizes conceptual understanding and the use of technology. In contrast, many education systems in Europe and Asia continue to emphasize mastery of standard algorithms, including long division, as a foundational arithmetic skill. For example, curricula in countries such as Japan and Germany typically introduce and reinforce long division during primary education, often alongside mental arithmetic strategies and problem-solving techniques. International assessments such as the Trends in International Mathematics and Science Study (TIMSS) highlight these differences, showing variation in how procedural fluency and conceptual understanding are balanced across educational systems. These differing approaches reflect broader educational philosophies regarding the balance between procedural fluency, conceptual understanding, and the role of technology in mathematics education. == Method == In English-speaking countries, long division does not use the division slash ⟨∕⟩ or division sign ⟨÷⟩ symbols but instead constructs a tableau. The divisor is separated from the dividend by a right parenthesis ⟨)⟩ or vertical bar ⟨|⟩; the dividend is separated from the quotient by a vinculum (i.e., an overbar). The combination of these two symbols is sometimes known as a long division symbol, division bracket, or even a bus stop. It developed in the 18th century from an earlier single-line notation separating the dividend from the quotient by a left parenthesis. The process is begun by dividing the left-most digit of the dividend by the divisor. The quotient (rounded down to an integer) becomes the first digit of the result, and the remainder is calculated (this step is notated as a subtraction). This remainder carries forward when the process is repeated on the following digit of the dividend (notated as 'bringing down' the next digit to the remainder). When all digits have been processed and no remainder is left, the process is complete. An example is shown below, representing the division of 500 by 4 (with a result of 125). 125 (Explanations) 4)500 4 ( 4 × 1 = 4) 10 ( 5 - 4 = 1) 8 ( 4 × 2 = 8) 20 (10 - 8 = 2) 20 ( 4 × 5 = 20) 0 (20 - 20 = 0) A more detailed breakdown of the steps goes as follows: Find the shortest sequence of digits starting from the left end of the dividend, 500, that the divisor 4 goes into at least once. In this case, this is simply the first digit, 5. The largest number that the divisor 4 can be multiplied by without exceeding 5 is 1, so the digit 1 is put above the 5 to start constructing the quotient. Next, the 1 is multiplied by the divisor 4, to obtain the largest whole number that is a multiple of the divisor 4 without exceeding the 5 (4 in this case). This 4 is then placed under and subtracted from the 5 to get the remainder, 1, which is placed under the 4 under the 5. Afterwards, the first as-yet unused digit in the dividend, in this case the first digit 0 after the 5, is copied directly underneath itself and next to the remainder 1, to form the number 10. At this point the process is repeated enough times to reach a stopping point: The largest number by which the divisor 4 can be multiplied without exceeding 10 is 2, so 2 is written above as the second leftmost quotient digit. This 2 is then multiplied by the divisor 4 to get 8, which is the largest multiple of 4 that does not exceed 10; so 8 is written below 10, and the subtraction 10 minus 8 is performed to get the remainder 2, which is placed below the 8. The next digit of the dividend (the last 0 in 500) is copied directly below itself and next to the remainder 2 to form 20. Then the largest number by which the divisor 4 can be multiplied without exceeding 20, which is 5, is placed above as the third leftmost quotient digit. This 5 is multiplied by the divisor 4 to get 20, which is written below and subtracted from the existing 20 to yield the remainder 0, which is then written below the second 20. At this point, since there are no more digits to bring down from the dividend and the last subtraction result was 0, we can be assured that the process finished. If the last remainder when we ran out of dividend digits had been something other than 0, there would have been two possible courses of action: We could just stop there and say that the dividend divided by the divisor is the quotient written at the top with the remainder written at the bottom, and write the answer as the quotient followed by a fraction that is the remainder divided by the divisor. We could extend the dividend by writing it as, say, 500.000... and continue the process (using a decimal point in the quotient directly above the decimal point in the dividend), in order to get a decimal answer, as in the following example. 31.75 4)127.00 12 (12 ÷ 4 = 3) 07 (0 remainder, bring down next figure) 4 (7 ÷ 4 = 1 r 3) 3.0 (bring down 0 and the decimal point) 2.8 (7 × 4 = 28, 30 ÷ 4 = 7 r 2) 20 (an additional zero is brought down) 20 (5 × 4 = 20) 0 In this example, the decimal part of the result is calculated by continuing the process beyond the units digit, "bringing down" zeros as being the decimal part of the dividend. This example also illustrates that, at the beginning of the process, a step that produces a zero can be omitted. Since the first digit 1 is less than the divisor 4, the first step is instead performed on the first two digits 12. Similarly, if the divisor were 13, one would perform the first step on 127 rather than 12 or 1. === Basic procedure for long division of n ÷ m === Find the location of all decimal points in the dividend n and divisor m. If necessary, simplify the long division problem by moving the decimals of the divisor and dividend by the same number of decimal places, to the right (or to the left), so that the decimal of the divisor is to the right of the last digit. When doing long division, keep the numbers lined up straight from top to bottom under the tableau. After each step, be sure the remainder for that step is less than the divisor. If it is not, there are three possible problems: the multiplication is wrong, the subtraction is wrong, or a greater quotient is needed. In the end, the remainder, r, is added to the growing quotient as a fraction, r⁄m. === Invariant property and correctness === The basic presentation of the steps of the process (above) focuses on what steps are to be performed, rather than the properties of those steps that ensure the result will be correct (specifically, that q × m + r = n, where q is the final quotient and r the final remainder). A slight variation of presentation requires more writing, and requires that we change, rather than just update, digits of the quotient, but can shed more light on why these steps actually produce the right answer by allowing evaluation of q × m + r at intermediate points in the process. This illustrates the key property used in the derivation of the algorithm (below). Specifically, we amend the above basic procedure so that we fill the space after the digits of the quotient under construction with 0's, to at least the 1's place, and include those 0's in the numbers we write below the division bra

    Read more →
  • Generative design

    Generative design

    Generative design is an iterative design process that uses software to generate outputs that fulfill a set of constraints iteratively adjusted by a designer. Whether a human, test program, or artificial intelligence, the designer algorithmically or manually refines the feasible region of the program's inputs and outputs with each iteration to fulfill evolving design requirements. By employing computing power to evaluate more design permutations than a human alone is capable of, the process is capable of producing an optimal design that mimics nature's evolutionary approach to design through genetic variation and selection. The output can be images, sounds, architectural models, animation, and much more. It is, therefore, a fast method of exploring design possibilities that is used in various design fields such as art, architecture, communication design, and product design. Generative design has become more important, largely due to new programming environments or scripting capabilities that have made it relatively easy, even for designers with little programming experience, to implement their ideas. Additionally, this process can create solutions to substantially complex problems that would otherwise be resource-exhaustive with an alternative approach, making it a more attractive option for problems with a large or unknown solution set. It is also facilitated with tools in commercially available CAD packages. Not only are implementation tools more accessible, but also tools leveraging generative design as a foundation. Recent advancements have led to the development of Deep Generative Design, a framework that integrates topology optimization with deep learning models, such as Generative Adversarial Networks (GANs). Unlike traditional evolutionary methods that primarily focus on engineering performance, this approach uses deep generative models to enhance aesthetic diversity and novelty while simultaneously satisfying engineering constraints. For instance, research by Oh et al. (2019) proposed a framework using Boundary Equilibrium GANs (BEGAN) to generate diverse design options which are then refined through density-based topology optimization, allowing for the exploration of complex design spaces that balance structural integrity with visual variation. In practice, generative design does not solely aim to produce a single optimal solution, but involves iteratively refining the design problem by modifying parameters, constraints, and evaluation criteria within a computational model, resulting in multiple design alternatives from which the designer selects. == Use in architecture == Generative design in architecture is an iterative design process that enables architects to explore a wider solution space with more possibility and creativity. Architectural design has long been regarded as a wicked problem. Compared with traditional top-down design approach, generative design can address design problems efficiently, by using a bottom-up paradigm that uses parametric-defined rules to generate complex solutions. The solution itself then evolves to a good, if not optimal, solution. The advantage of using generative design as a design tool is that it does not construct fixed geometries, but take a set of design rules that can generate an infinite set of possible design solutions. The generated design solutions can be more sensitive, responsive, and adaptive to the problem. Generative design involves rule definition and result analysis that are integrated with the design process. By defining parameters and rules, the generative approach is able to provide optimized solution for both structural stability and aesthetics. Possible design algorithms include cellular automata, shape grammar, genetic algorithm, space syntax, and most recently, artificial neural network. Due to the high complexity of the solution generated, rule-based computational tools, such as finite element method and topology optimisation, are preferred to evaluate and optimise the generated solution. The iterative process provided by computer software enables the trial-and-error approach in design, and involves architects interfering with the optimisation process. Historically precedent work includes Antoni Gaudí's Sagrada Família, which used rule based geometrical forms for structures, and Buckminster Fuller's Montreal Biosphere where the rules were designed to generate individual components, rather than the final product. More recent generative-design cases include Foster and Partners' Queen Elizabeth II Great Court, where the tessellated glass roof was designed using a geometric schema to define hierarchical relationships, and then the generated solution was optimized based on geometrical and structural requirements. == Use in sustainable design == Generative design in sustainable design is an effective approach addressing energy efficiency and climate change at the early design stage, recognizing buildings contribute to approximately one-third of global greenhouse gas emissions and 30%-40% of total building energy use. It integrates environmental principles with algorithms, enabling exploration of countless design alternatives to enhance energy performance, reduce carbon footprints, and minimize waste. A key feature of generative design in sustainable design is its ability to incorporate Building Performance Simulations (BPS) into the design process. Simulation programs such as EnergyPlus, Ladybug Tools,, and so on, combined with generative algorithms, can optimize design solutions for cost-effective energy use and zero-carbon building designs. For example, the GENE_ARCH system used a Pareto algorithm with building energy simulation for the whole building design optimization. Generative design has improved sustainable facade design, as illustrated by the algorithm of cellular automata and daylight simulations in adaptive facade design. In addition, genetic algorithms were used with radiation simulations for energy-efficient photo-voltaic (PV) modules on high-rise building facades. Generative design is also applied to life cycle analysis (LCA), as demonstrated by a framework using grid search algorithms to optimize exterior wall design for minimum environmental impact. Multi-objective optimization embraces multiple diverse sustainability goals, such as interactive kinetic louvers using biomimicry and daylight simulations to enhance daylight, visual comfort, and energy efficiency. The study of PV and shading systems can maximize on-site electricity, improve visual quality, and daylight performance. Artificial intelligence (AI) and machine learning (ML) further improve computation efficiency in complex climate-responsive sustainable design. One study employed reinforcement learning to identify the relationship between design parameters and energy use for a sustainable campus, while other studies tried hybrid algorithms, such as using the genetic algorithm and GANs to balance daylight illumination and thermal comfort under different roof conditions. Other popular AI tools were also integrated, including deep reinforcement learning (DRL) and computer vision (CV), to generate an urban block according to direct sunlight hours and solar heat gains. These AI-driven generative design methods enable faster simulations and design decision making, resulting in designs that are environmentally responsible. == Use in additive manufacturing == Additive manufacturing (AM) is a process that creates physical models directly from three-dimensional (3D) data by joining materials layer by layer. It is used in industries to produce a variety of end-use parts, which are final components designed for direct application in products or systems. AM provides design flexibility and enables material reduction in lightweight applications, such as aerospace, automotive, medical, and portable electronic devices, where minimizing weight is critical for performance. Generative design, one of the four key methods for lightweight design in AM, is commonly applied to optimize structures for specific performance requirements. Generative design can help create optimized solutions that balance multiple objectives, such as enhancing performance while minimizing cost. In design for additive manufacturing (DfAM), multi-objective topology optimization is used to generate a set of candidate solutions. Designers then assess these options using their expertise and key performance indicators (KPIs) to select the best option for implementation. However, integrating AM constraints (e.g., speed of build, materials, build envelope, and accuracy) into generative design remains challenging, as ensuring all solutions are valid is complex. Balancing multiple design objectives while limiting computational costs adds further challenges for designers. To overcome these difficulties, researchers proposed a generative design method with manufacturing validation to improve decision-making efficiency. This method starts with a cons

    Read more →
  • Automated journalism

    Automated journalism

    Automated journalism, also known as algorithmic journalism or robot journalism, is a term that attempts to describe modern technological processes that are now in use in the journalistic profession, such as news articles and videos generated by computer programs. There are four main fields of application for automated journalism, namely automated content production, data mining, news dissemination and content optimization. Through generative artificial intelligence, stories are produced automatically by computers rather than human reporters. In the 2020s, generative pre-trained transformers have enabled the generation of articles, simply by providing prompts. Automated journalism is sometimes seen as an opportunity to free journalists from routine reporting, providing them with more time for complex tasks. It also allows efficiency and cost-cutting, alleviating some financial burden that many news organizations face. However, automated journalism is also perceived as a threat to the authorship and quality of news and a threat to the livelihoods of human journalists. == History == Historically, the process involved an algorithm that scanned large amounts of provided data, selected from an assortment of pre-programmed article structures, ordered key points, and inserted details such as names, places, amounts, rankings, statistics, and other figures. These programs interpret, organize, and present data in human-readable ways. The output can also be customized to fit a certain voice, tone, or style. Early implementations were mainly used for stories based on statistics and numerical figures. Common topics include sports recaps, weather, financial reports, real estate analysis, and earnings reviews. Data science and AI companies such as Automated Insights, Narrative Science, United Robots and Monok develop and provide these algorithms to news outlets. In 2016, early adopters included news providers such as the Associated Press, Forbes, ProPublica, and the Los Angeles Times. StatSheet, an online platform covering college basketball, runs entirely on an automated program. In 2006, Thomson Reuters announced their switch to automation to generate financial news stories on its online news platform. Reuters used a tool called Tracer. An algorithm called Quakebot published a story about a 2014 California earthquake on The Los Angeles Times website within three minutes after the shaking had stopped. The Associated Press began using automation to cover 10,000 minor baseball leagues games annually, using a program from Automated Insights and statistics from MLB Advanced Media. Outside of sports, the Associated Press also uses automation to produce stories on corporate earnings. Since 2014, Associated Press has been publishing quarterly financial stories with help from Automated Insights. In May 2020, Microsoft announced that a number of its MSN contract journalists would be replaced by robot journalism. On 8 September 2020, The Guardian published an article entirely written by the neural network GPT-3, although the published fragments were manually picked by a human editor. Agentic Tribune produces all of its news articles automatically using AI. News broadcasters in Kuwait, Greece, South Korea, India, China and Taiwan have presented news with anchors based on generative AI models, prompting concerns about job losses for human anchors and audience trust in news that has historically been influenced by parasocial relationships with broadcasters, content creators or social media influencers. Algorithmically generated anchors have also been used by allies of ISIS for their broadcasts. In 2023, Google reportedly pitched a tool to news outlets that claimed to "produce news stories" based on input data provided, such as "details of current events". Some news company executives who viewed the pitch described it as "[taking] for granted the effort that went into producing accurate and artful news stories." In February 2024, Google launched a program to pay small publishers to write three articles per day using a beta generative AI model. The program does not require the knowledge or consent of the websites that the publishers are using as sources, nor does it require the published articles to be labeled as being created or assisted by these models. Meta AI, a chatbot based on Llama 3 which summarizes news stories, was noted by The Washington Post to copy sentences from those stories without direct attribution and to potentially further decrease the traffic of online news outlets. == Benefits == === Speed === Robot reporters are built to produce large quantities of information at quicker speeds. The Associated Press announced that their use of automation has increased the volume of earnings reports from customers by more than ten times. With software from Automated Insights and data from other companies, they can produce 150 to 300-word articles in the same time it takes journalists to crunch numbers and prepare information. By automating routine stories and tasks, journalists are promised more time for complex jobs such as investigative reporting and in-depth analysis of events. Francesco Marconi of the Associated Press stated that, through automation, the news agency freed up 20 percent of reporters’ time to focus on higher-impact projects. This has also been stated by a spokesperson at Gannett, who stated "By leveraging AI, we are able to expand coverage and enable our journalists to focus on more in-depth reporting." GBH reports that AI tools help increase the reach of news publishers. Mike Carragi, a product manager at Patch, stated that they were able to increase their reach from 1200 communities to 7000 communities in just a few months without the need for new employees solely through the adoption of generative AI. In fact, many communities are served solely by AI generated content, which creates summaries of existing information within the community. === Cost === Automated journalism is cheaper because more content can be produced within less time. It also lowers labour costs for news organizations. Reduced human input means less expenses on wages or salaries, paid leaves, vacations, and employment insurance. Automation serves as a cost-cutting tool for news outlets struggling with tight budgets but still wish to maintain the scope and quality of their coverage. == Concerns == === Authorship === In an automated story, there is often confusion about who should be credited as the author. Several participants of a study on algorithmic authorship attributed the credit to the programmer; others perceived the news organization as the author, emphasizing the collaborative nature of the work. There is also no way for the reader to verify whether an article was written by a robot or human, which raises issues of transparency although such issues also arise with respect to authorship attribution between human authors too. === Credibility and quality === Concerns about the perceived credibility of automated news is similar to concerns about the perceived credibility of news in general. Critics doubt if algorithms are "fair and accurate, free from subjectivity, error, or attempted influence." Again, these issues about fairness, accuracy, subjectivity, error, and attempts at influence or propaganda has also been present in articles written by humans over thousands of years. A common criticism is that machines do not replace human capabilities such as creativity, humour, and critical-thinking. However, as the technology evolves, the aim is to mimic human characteristics. When the UK's Guardian newspaper used an AI to write an entire article in September 2020, commentators pointed out that the AI still relied on human editorial content. Austin Tanney, the head of AI at Kainos said: "The Guardian got three or four different articles and spliced them together. They also gave it the opening paragraph. It doesn’t belittle what it is. It was written by AI, but there was human editorial on that." The largest single study of readers' evaluations of news articles produced with and without the help of automation exposed 3,135 online news consumers to 24 articles. It found articles that had been automated were significantly less comprehensible, in part because they were considered to contain too many numbers. However, the automated articles were evaluated equally on other criteria including tone, narrative flow, and narrative structure. Beyond human evaluation, there are now numerous algorithmic methods to identify machine written articles although some articles may still contain errors that are obvious for a human to identify, they can at times score better with these automatic identifiers than human-written articles. A 2017 Nieman Reports article by Nicola Bruno discusses whether or not machines will replace journalists and addresses concerns around the concept of automated journalism practices. Ultimately, Bruno came to the conclusion that AI would assist journalist

    Read more →
  • Species distribution modelling

    Species distribution modelling

    Species distribution modelling (SDM), also known as environmental (or ecological) niche modelling (ENM), habitat suitability modelling, predictive habitat distribution modelling, and range mapping uses ecological models to predict the distribution of a species across geographic space and time using environmental data. The environmental data are most often climate data (e.g. temperature, precipitation), but can include other variables such as soil type, water depth, and land cover. SDMs are used in several research areas in conservation biology, ecology and evolution. These models can be used to understand how environmental conditions influence the occurrence or abundance of a species, and for predictive purposes (ecological forecasting). Predictions from an SDM may be of a species' future distribution under climate change, a species' past distribution in order to assess evolutionary relationships, or the potential future distribution of an invasive species. Predictions of current and/or future habitat suitability can be useful for management applications (e.g. reintroduction or translocation of vulnerable species, reserve placement in anticipation of climate change). There are two main types of SDMs. Correlative SDMs, also known as climate envelope models, bioclimatic models, or resource selection function models, model the observed distribution of a species as a function of environmental conditions. Mechanistic SDMs, also known as process-based models or biophysical models, use independently derived information about a species' physiology to develop a model of the environmental conditions under which the species can exist. The extent to which such modelled data reflect real-world species distributions will depend on a number of factors, including the nature, complexity, and accuracy of the models used and the quality of the available environmental data layers; the availability of sufficient and reliable species distribution data as model input; and the influence of various factors such as barriers to dispersal, geologic history, or biotic interactions, that increase the difference between the realized niche and the fundamental niche. Environmental niche modelling may be considered a part of the discipline of biodiversity informatics. == History == A. F. W. Schimper used geographical and environmental factors to explain plant distributions in his 1898 Pflanzengeographie auf physiologischer Grundlage (Plant Geography Upon a Physiological Basis) and his 1908 work of the same name. Andrew Murray used the environment to explain the distribution of mammals in his 1866 The Geographical Distribution of Mammals. Robert Whittaker's work with plants and Robert MacArthur's work with birds strongly established the role the environment plays in species distributions. Elgene O. Box constructed environmental envelope models to predict the range of tree species. His computer simulations were among the earliest uses of species distribution modelling. The adoption of more sophisticated generalised linear models (GLMs) made it possible to create more sophisticated and realistic species distribution models. The expansion of remote sensing and the development of GIS-based environmental modelling increase the amount of environmental information available for model-building and made it easier to use. == Correlative vs mechanistic models == === Correlative SDMs === SDMs originated as correlative models. Correlative SDMs model the observed distribution of a species as a function of geographically referenced climatic predictor variables using multiple regression approaches. Given a set of geographically referred observed presences of a species and a set of climate maps, a model defines the most likely environmental ranges within which a species lives. Correlative SDMs assume that species are at equilibrium with their environment and that the relevant environmental variables have been adequately sampled. The models allow for interpolation between a limited number of species occurrences. For these models to be effective, it is required to gather observations not only of species presences, but also of absences, that is, where the species does not live. Records of species absences are typically not as common as records of presences, thus often "random background" or "pseudo-absence" data are used to fit these models. If there are incomplete records of species occurrences, pseudo-absences can introduce bias. Since correlative SDMs are models of a species' observed distribution, they are models of the realized niche (the environments where a species is found), as opposed to the fundamental niche (the environments where a species can be found, or where the abiotic environment is appropriate for the survival). For a given species, the realized and fundamental niches might be the same, but if a species is geographically confined due to dispersal limitation or species interactions, the realized niche will be smaller than the fundamental niche. Correlative SDMs are easier and faster to implement than mechanistic SDMs, and can make ready use of available data. Since they are correlative however, they do not provide much information about causal mechanisms and are not good for extrapolation. They will also be inaccurate if the observed species range is not at equilibrium (e.g. if a species has been recently introduced and is actively expanding its range). In standard SDMs, the distribution of a single species is often modeled, with unique parameters describing how environmental (abiotic) factors influence its occurrence probability. This allows for differentiated responses to environmental drivers among species, but can be problematic for data-deficient species. In contrast, similarities in environmental responses can be accounted for in multi-species SDMs, which model several species jointly using shared or hierarchically related parameters. However, neither approach explicitly accounts for community-level biotic interactions, which can be important in explaining species diversity patterns. Joint species distribution models (joint SDMs or J-SDMs) address this by modeling species co-occurrence patterns directly. The occurrence probability of a given species is thus influenced not only by abiotic drivers but also by inferred biotic associations with other species. This can improve accuracy for rarer taxa and provide insights into community ecology. Both standard SDMs and J-SDMs can be used to generate community-level metrics, such as species richness, by aggregating outputs across multiple species. These can be important for decision-making such as conservation planning. === Mechanistic SDMs === Mechanistic SDMs are more recently developed. In contrast to correlative models, mechanistic SDMs use physiological information about a species (taken from controlled field or laboratory studies) to determine the range of environmental conditions within which the species can persist. These models aim to directly characterize the fundamental niche, and to project it onto the landscape. A simple model may simply identify threshold values outside of which a species can't survive. A more complex model may consist of several sub-models, e.g. micro-climate conditions given macro-climate conditions, body temperature given micro-climate conditions, fitness or other biological rates (e.g. survival, fecundity) given body temperature (thermal performance curves), resource or energy requirements, and population dynamics. Geographically referenced environmental data are used as model inputs. Because the species distribution predictions are independent of the species' known range, these models are especially useful for species whose range is actively shifting and not at equilibrium, such as invasive species. Mechanistic SDMs incorporate causal mechanisms and are better for extrapolation and non-equilibrium situations. However, they are more labor-intensive to create than correlational models and require the collection and validation of a lot of physiological data, which may not be readily available. The models require many assumptions and parameter estimates, and they can become very complicated. Dispersal, biotic interactions, and evolutionary processes present challenges, as they aren't usually incorporated into either correlative or mechanistic models. Correlational and mechanistic models can be used in combination to gain additional insights. For example, a mechanistic model could be used to identify areas that are clearly outside the species' fundamental niche, and these areas can be marked as absences or excluded from analysis. See for a comparison between mechanistic and correlative models. == Niche models (correlative) == There are a variety of mathematical methods that can be used for fitting, selecting, and evaluating correlative SDMs. Models include "profile" methods, which are simple statistical techniques that use e.g. environmental distance to known sites of occurrence such as

    Read more →
  • Penril

    Penril

    Penril DataComm Networks, Inc. was a computer telecommunications hardware company that made some acquisitions and was eventually split into two parts: one was acquired by Bay Networks and the other was a newly formed company named Access Beyond. The focus of both company's products was end-to-end data transfer. By the mid-1990s, with the popularization of the internet, this was no longer of wide interest. == History == Penril, whose earnings reports and other financials were followed by The New York Times in the 1990s, made several acquisitions but also grew internally. Following its Datability acquisition it renamed itself Penril Datability Networks. By the time the 1968-founded Penril was acquired by Bay their name was Penril DataComm Networks. The company, which as of 1985 "had made 14 acquisitions in 12 years," also had done extensive work regarding quality control, and leveraged their product line by what The Washington Post called clever packaging: "software, cables, instructions and telephone support" sold to those less technically skilled as "Network in a Box." == Datability == Datability Software Systems Inc. was the initial name of what by 1991 became 'Datability, Inc.', "a manufacturer of hardware that links computer networks." The 1977-founded firm began as a software consulting company, especially in the area of databases. To speed up project development they built a program generator, which they marketed as Control 10/20 (targeted at users of Digital Equipment Corporation's DECsystem-10 and DECSYSTEM-20). After trying their hand at time-sharing they built hardware to enhance bridging these computers to DEC's VAX product line. In particular they focused on Digital's LAT protocol, selling "boxes" that reimplemented the protocol, at a lower price than DEC's. They later expanded into other areas of telecommunications hardware The firm relocated to a larger manufacturing plant in 1991 and was acquired by Penril in 1993. == Access Beyond == Access Beyond was initially housed by Penril, from which it was spun off. A securities analyst noted that Access began operations with no debt. They subsequently merged with Hayes Corporation. Some of the funds brought to the merger came from a sale by Penril of two of its divisions, each bringing about $4 million. == Ron Howard == Ron Howard, founder of Datability, became part of Penril when the latter acquired the former, and was CEO of Access Beyond when it was spun off by Penril. Access merged with Hayes Microcomputer Products and was renamed Hayes Corp, at which time Howard became executive VP of business development and corporate vice chairman of Hayes. == People == In the matter of hiring immigrants, in an industry where recent arrivals came from a culture of six day work weeks, and subcontracting was then common, these assembly line workers at Penril comprised about 25%, compared to double in other firms. Placement was overseen by government agencies. == Controversy == Penril had a joint development agreement, beginning in 1990, with a Standard Microsystems Corporation (SMSC) subsidiary. A dispute arose, and the matter was brought to court. Penril was awarded $3.5 million in 1996.

    Read more →
  • VSCO

    VSCO

    VSCO ( ), formerly known as VSCO Cam, is a photography mobile app available for iOS and Android devices. The app was created by Joel Flory and Greg Lutze. The VSCO app allows users to capture photos in the app and edit them, using preset filters and editing tools. == History == Visual Supply Company was founded by Joel Flory and Greg Lutze in California, in 2011. VSCO was launched in 2012. It raised $40 million from investors in May 2014. In 2017, VSCO launched a subscription model. As of 2018, Visual Supply Company has $90 million in funding from investors and over 2 million paying members. In 2019, VSCO acquired Rylo, a video editing startup founded by the original developer of Instagram’s Hyperlapse. Visual Supply Company has locations in Oakland, California, where it is headquartered, and Chicago, Illinois. In December 2020 VSCO acquired AI-powered video editing app Trash. In April 2018, VSCO reached over 30 million users. In September 2023, Eric Wittman was appointed as the new CEO and co-founder Joel Flory became executive chairman. == Usage == Users must register an account to use the app. Photos can be taken or imported from the camera roll, as well as short videos or animated GIFs (known in the app as DSCO; iOS only). The user can edit their photos through various preset filters, or through the "toolkit" feature which allows finer adjustments to fade, clarity, skin tone, tint, sharpness, saturation, contrast, temperature, exposure, and other properties. Users have the option of posting their photos to their profile, where they can also add captions and hashtags. Photos can also be exported back into the camera roll or shared with other social networking services. The users also have an option to edit their own videos from their camera roll with the VSCO yearly membership, but they are not able to post camera roll as VSCO Film X videos to their account on VSCO. JPEG and raw image files can be used. Research on image based social media platforms has found that engagement with posting, editing, and interacting with images can influence users' mood, self esteem, and body satisfaction. Studies also suggest that greater emotional investment in social media content is associated with increased negative psychological outcomes including stress and depressive symptoms. == In popular culture == VSCO's Oakland headquarters was a key filming location for Boots Riley's 2018 film Sorry to Bother You.

    Read more →
  • Sedona Canada Principles

    Sedona Canada Principles

    The Sedona Canada Principles are a set of authoritative guidelines published by The Sedona Conference to aid members of the Canadian legal community involved in the identification, collection, preservation, review and production of electronically stored information (ESI). The principles were drafted by a small group of lawyers, judges and technologists called the Sedona Working Group 7 or Sedona Canada. Sedona Canada is an offshoot of The Sedona Conference which is an American "non-profit ... research and educational institute dedicated to the advanced study of law and policy in the areas of antitrust law, complex litigation, and intellectual property rights". == Background == Civil procedure in Canada is jurisdictional with each province following its own rules of civil procedure. However, each province must address the fact that due to the advancement of technology the discovery process enshrined in the rules of civil procedure can be potentially derailed due to the sheer volume of electronically stored information (ESI). When dealing with litigation matters that involve electronically stored information (ESI), the discovery process is commonly called e-discovery. The problems associated with e-discovery in Canada led to the creation of the Sedona Canada Principles. Rule 29.1.03(4) of the wikibooks:Ontario Rules of Civil Procedure specifically refers to the Sedona Canada Principles in referencing Principles re Electronic Discovery although it has been reported that this rule has been largely ignored in practice. == Summary == The Sedona Canada Principles largely refer to the processes found in the Electronic Discovery Reference Model. The principles urge proportionality due to the potentially enormous volumes of documents that may be discoverable when dealing with ESI. They also encourage good faith in the document preservation stage and regular meetings between parties to discuss the scope of the litigation. Parties are urged to be aware of the potential costs involved in producing relevant ESI but are advised that only reasonably accessible ESI need be produced. The principles stipulate that parties should not be required to search for or collect deleted material unless there is an agreement or court order related to those terms. The use of electronic tools and processes such as data sampling and web harvesting are acceptable practices. Parties are encouraged to agree early in the litigation process on production format required for the exchange of relevant documents as part of the discovery process (native files, pdf, tiff, metadata requirements etc.). Agreements or direction should be sought, if necessary, with respect to privilege or other confidential information related to production of electronic documents and data. Parties should be aware that legal precedents can be formed as a result of e-discovery practices and sanctions can be considered for a party's failure to meet their discovery obligations unless it can be demonstrated that the failure was not intentional. All parties must bear the “reasonable” costs associated with e-discovery but other arrangements can be agreed upon by the parties or by court order. == Caselaw == In Warman v. National Post Company proportionality was at issue in a case where the plaintiff was suing the defendant for libel. A motion was brought by the defendant to have the plaintiff provide a mirror image of his hard drive in an effort to prove an internet article was indeed authored by the plaintiff. Issues of proportionality and the work of the Sedona Conference and Sedona Canada Principles were factored in to the decision to grant the defendant only limited access to the hard drive. In Innovative Health Group Inc. v. Calgary Health Region the plaintiff's legal obligation to produce imaged hard drives is in question. Justice Conrad refers to the advice of Sedona Canada on proportionality and problems associated with time and expense related to the difficulties associated with electronically stored information. In York University v. Michael Markicevic Justice Brown specifically refers to the need for the parties to agree upon a formal e-discovery plan to be drafted in consultation with Sedona Canada Principles. In Friends of Lansdowne v. Ottawa Master MacLeod refers to the need for Sedona Canada principles and states “This is particularly true in the current information age when e-mail is ubiquitous and multiple copies or variants of messages may be held on various kinds of data storage devices including individual hard drives, e-mail and Blackberry servers. Even documents that ultimately exist in paper form normally begin their life on computers and negotiations frequently involve exchanges of electronic drafts. To find every scrap of paper and every electronic trace of relevant information has become a nightmarish task that threatens to render any kind of litigation extravagantly expensive.” == Criticism == Critics of the Sedona Canada Principles believe they should address system integrity and that the true history of any file preserved cannot be identified without proof of the integrity of the electronic record systems management it comes from. Other criticism is more directed to the Sedona Canada working group and complaints that it is insular and irrelevant.

    Read more →
  • Explore-then-commit algorithm

    Explore-then-commit algorithm

    Explore Then Commit (ETC) is an algorithm for the multi-armed bandit problem foc,used on finding the best trade-off between exploration and exploitation. == Multi-armed bandit problem == The multi-armed bandit problem is a sequential game where one player has to choose at each turn between K {\displaystyle K} actions (arms). Behind every arm a {\displaystyle a} is an unknown distribution ν a {\displaystyle \nu _{a}} that lies in a set D {\displaystyle {\mathcal {D}}} known by the player (for example, D {\displaystyle {\mathcal {D}}} can be the set of Gaussian distributions or Bernoulli distributions). At each turn t {\displaystyle t} the player chooses (pulls) an arm a t {\displaystyle a_{t}} , they then get an observation X t {\displaystyle X_{t}} of the distribution ν a t {\displaystyle \nu _{a_{t}}} . === Regret minimization === The goal is to minimize the regret at time T {\displaystyle T} that is defined as R T := ∑ a = 1 K Δ a E [ N a ( T ) ] {\displaystyle R_{T}:=\sum _{a=1}^{K}\Delta _{a}\mathbb {E} [N_{a}(T)]} where μ a := E [ ν a ] {\displaystyle \mu _{a}:=\mathbb {E} [\nu _{a}]} is the mean of arm a {\displaystyle a} μ ∗ := max a μ a {\displaystyle \mu ^{}:=\max _{a}\mu _{a}} is the highest mean Δ a := μ ∗ − μ a {\displaystyle \Delta _{a}:=\mu ^{}-\mu _{a}} N a ( t ) {\displaystyle N_{a}(t)} is the number of pulls of arm a {\displaystyle a} up to turn t {\displaystyle t} The player has to find an algorithm that chooses at each turn t {\displaystyle t} which arm to pull based on the previous actions and observations ( a s , X s ) s < t {\displaystyle (a_{s},X_{s})_{s Read more →