AI Content Creator Course Review

AI Content Creator Course Review — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Visual Expert

    Visual Expert is a static code analysis tool that extracts design and technical information from software source code by reverse engineering. Programmers use it for software maintenance, modernization, and optimization. It is designed to parse several programming languages at the same time (PL/SQL, Transact-SQL, PowerBuilder...) and to analyze cross-language dependencies in addition to each language's source code. Visual Expert checks source code against hundreds of code inspection rules to find vulnerabilities, bugs, and maintenance issues.

    == Features ==

      • Cross-reference exploration: impact analysis, E/R diagrams, call graphs, CRUD matrix, dependency graphs.
      • Software documentation: a documentation generator produces technical documentation and low-level design descriptions.
      • Code inspection to detect bugs, security vulnerabilities, and maintainability issues.
      • Native integration with Jenkins.
      • Reports on duplicate code, unused objects and methods, and naming conventions.
      • Calculation of software metrics and source lines of code.
      • Code comparison: finds differences between several versions of the same code.
      • Performance analysis: identifies code parts that slow down the application because of their syntax. It extracts statistics about code execution from the database and combines them with the static analysis of the code.

    == Usage ==

    Visual Expert is used in several contexts:

      • Change impact analysis: evaluating the consequences of a change in the code or in a database, and avoiding negative side effects when evolving a system.
      • Static Application Security Testing (SAST): detecting and removing security issues.
      • Continuous Integration / Continuous Inspection: adding a static code analysis job to a CI/CD workflow to automatically verify the quality and security of a new build when it is released.
      • Program comprehension: helping programmers understand and maintain existing code, or modernize legacy systems, and transferring knowledge of the code from one programmer to another.
      • Software sizing: calculating the size of an application, or a piece of code, in order to estimate development effort.
      • Code review: improving the code by finding and removing code smells, dead code, code causing poor performance, or violations of coding conventions.

    == Limitations ==

    As a static code analyzer, Visual Expert is limited to the programming languages supported by its code parsers: Oracle PL/SQL, SQL Server Transact-SQL, and PowerBuilder.

    A preliminary reverse-engineering pass is required. Visual Expert performs it automatically, but its duration depends on the size of the code parsed. Users must wait for parsing to complete before using the features, or schedule it in advance. They must also allocate sufficient hardware resources to support their volume of code.

    Visual Expert is based on a client/server architecture: the code analysis runs on a Windows PC, preferably a server. The information extracted from the code is stored in an RDBMS, which communicates with a client application installed on the programmer's computer; no web client is available. This requires that the code, the parsers, the RDBMS, and the programmers' computers are connected to the same LAN or VPN.

    == History ==

      • 1995-1998 - Prog and Doc - Initial version distributed on the French market
      • 2001 - Visual Expert 4.5
      • 2003 - Visual Expert 5
      • 2007 - Visual Expert 5.7
      • 2010 - Visual Expert 6.0
      • 2015 - Visual Expert 2015 - Server component added to schedule code analyses
      • 2016 - Visual Expert 2016 - Oracle PL/SQL code parser, code inventory (lines of code, number of objects…)
      • 2017 - Visual Expert 2017 - SQL Server T-SQL code parser, code comparison, CRUD matrix
      • 2018 - Visual Expert 2018 - DB code performance analysis, integration with TFS
      • 2019 - Visual Expert 2019 - Generation of E/R diagrams from the code
      • 2020 - Visual Expert 2020 - Object dependency matrix, naming consistency verification, integration with Git and SVN
      • 2021 - Visual Expert 2021 - Continuous code inspection, integration with Jenkins
      • 2022 - Visual Expert 2022 - Support for cloud-based repositories and large volumes of code
      • 2023 - Visual Expert 2023 - Performance tuning for PowerBuilder
      • 2024 - Visual Expert 2024 - New web UI to simplify deployment and use among large teams
      • 2025 - Visual Expert 2025 - AI-based features to explain code, generate comments, and optimize queries

  • List of information schools

    This list of information schools, sometimes abbreviated to iSchools, includes members of the iSchools organization. The iSchools organization is a consortium of over 130 information schools across the globe.

    == History ==

    The first iSchools Caucus was formed in 1988 by Syracuse, Pittsburgh, and Drexel and was called the Gang of Three (sometimes Gang of Four, with Rutgers). Syracuse renamed the School of Library Science as the School of Information Studies in 1974, and is considered the first "iSchool" in history. The group was formally named "the iSchools Caucus" or, more casually, the iCaucus. By 2003, the group had expanded to include the Universities of Michigan, Washington, Illinois, UNC, Florida State, Indiana, and Texas, and was called the Gang of Ten. The current iSchools Caucus organization was formalized by 2005, with the additions of UC Berkeley, UC Irvine, UCLA, Penn State, Georgia Tech, Maryland, Toronto, Carnegie Mellon, and Singapore Management University.

    == iSchools organization ==

    The iSchools promote an interdisciplinary approach to understanding the opportunities and challenges of information management, with a core commitment to concepts like universal access and user-centered organization of information. The field is concerned broadly with questions of design and preservation across information spaces, from digital and virtual spaces such as online communities, social networking, the World Wide Web, and databases to physical spaces such as libraries, museums, collections, and other repositories. "School of Information", "Department of Information Studies", or "Information Department" are often the names of the participating organizations.

    Degree programs at iSchools include course offerings in areas such as information architecture, design, policy, and economics; knowledge management, user experience design, and usability; preservation and conservation; librarianship and library administration; the sociology of information; and human-computer interaction and computer science.

    === Leadership ===

    The executive committee of the iSchools is made up of the current chair (Ina Fourie, University of Pretoria, South Africa), the past chair (Gillian Oliver, Monash University, Australia), and the chair elect (Javed Mostafa, University of Toronto, Canada), plus representatives from the three regions (North America, Europe, and Asia-Pacific). The current executive director is Slava Sterzer.

    == Member institutions ==

    Between 2010 and 2026, the organization expanded globally beyond North America, growing to 133 member schools as of March 2026. For an updated and complete list of member schools, see the member database of the iSchools.

    == iConferences ==

    Members of the iSchools organize a regular academic conference, known as the iConference, hosted by a different member institution each year.

      • September 2005: Pennsylvania State University
      • October 2006: University of Michigan
      • February 2008: University of California, Los Angeles
      • February 2009: University of North Carolina
      • February 2010: University of Illinois at Urbana-Champaign
      • February 2011: University of Washington, Seattle
      • February 2012: University of Toronto
      • February 2013: University of North Texas
      • March 2014: Humboldt-Universität zu Berlin
      • March 2015: University of California, Irvine
      • March 2016: Drexel University
      • March 2017: Wuhan University
      • March 2018: University of Sheffield and Northumbria University
      • March 2019: University of Maryland
      • March 2020: University of Borås (virtual only)
      • March 2021: Renmin University of China (virtual only)
      • February/March 2022: University of Texas at Austin, University College Dublin & Kyushu University (virtual only)
      • March 2023: Universitat Oberta de Catalunya
      • March 2024: Jilin University
      • March 2025: Indiana University
      • March/April 2026: Edinburgh Napier University
      • 2027: Victoria University of Wellington

    == Other schools of information ==

    Other information schools and programs include:

      • Documentation Research and Training Centre, Indian Statistical Institute, Bangalore
      • San Jose State University, School of Information
      • University of Southern California Library Science Degree
      • Ankara University, Department of Information and Records Management, Ankara/Turkey
      • Marmara University, Department of Information and Records Management, Istanbul/Turkey
      • University of Kelaniya, Department of Library and Information Science, Kelaniya/Sri Lanka
      • University of Colombo, National Institute of Library and Information Science (NILIS), Colombo/Sri Lanka
      • Chicago State University, Department of Information Studies

  • Generalized distributive law

    The generalized distributive law (GDL) is a generalization of the distributive property which gives rise to a general message passing algorithm. It is a synthesis of the work of many authors in the information theory, digital communications, signal processing, statistics, and artificial intelligence communities. The law and algorithm were introduced in a semi-tutorial by Srinivas M. Aji and Robert J. McEliece with the same title.

    == Introduction ==

    "The distributive law in mathematics is the law relating the operations of multiplication and addition, stated symbolically, $a \ast (b + c) = a \ast b + a \ast c$; that is, the monomial factor $a$ is distributed, or separately applied, to each term of the binomial factor $b + c$, resulting in the product $a \ast b + a \ast c$" – Britannica.

    As can be observed from the definition, applying the distributive law to an arithmetic expression reduces the number of operations in it. In the previous example the total number of operations is reduced from three (two multiplications and an addition in $a \ast b + a \ast c$) to two (one multiplication and one addition in $a \ast (b + c)$). Generalization of the distributive law leads to a large family of fast algorithms, including the FFT and the Viterbi algorithm. This is explained more formally in the example below:

    $$\alpha(a, b) \stackrel{\mathrm{def}}{=} \sum_{c, d, e \in A} f(a, c, b)\, g(a, d, e)$$

    where $f(\cdot)$ and $g(\cdot)$ are real-valued functions, $a, b, c, d, e \in A$, and $|A| = q$ (say). Here we are "marginalizing out" the independent variables ($c$, $d$, and $e$) to obtain the result.

    To calculate the computational complexity, note that for each of the $q^{2}$ pairs $(a, b)$ there are $q^{3}$ terms, one per triplet $(c, d, e)$, that take part in the evaluation of $\alpha(a, b)$, with each step requiring one addition and one multiplication. Therefore, the total number of computations needed is $2 \cdot q^{2} \cdot q^{3} = 2q^{5}$, and the asymptotic complexity of the above function is $O(q^{5})$.

    If we apply the distributive law to the RHS of the equation, we get the following:

    $$\alpha(a, b) \stackrel{\mathrm{def}}{=} \sum_{c \in A} f(a, c, b) \cdot \sum_{d, e \in A} g(a, d, e)$$

    This implies that $\alpha(a, b)$ can be described as a product $\alpha_{1}(a, b) \cdot \alpha_{2}(a)$, where

    $$\alpha_{1}(a, b) \stackrel{\mathrm{def}}{=} \sum_{c \in A} f(a, c, b)$$

    and

    $$\alpha_{2}(a) \stackrel{\mathrm{def}}{=} \sum_{d, e \in A} g(a, d, e)$$

    Counting operations again, there are $q^{3}$ additions in each of $\alpha_{1}(a, b)$ and $\alpha_{2}(a)$, and $q^{2}$ multiplications when the product $\alpha_{1}(a, b) \cdot \alpha_{2}(a)$ is used to evaluate $\alpha(a, b)$. Therefore, the total number of computations needed is $q^{3} + q^{3} + q^{2} = 2q^{3} + q^{2}$. Hence the asymptotic complexity of calculating $\alpha(a, b)$ reduces from $O(q^{5})$ to $O(q^{3})$. This shows by example that applying the distributive law reduces computational complexity, which is one of the good features of a "fast algorithm".

    == History ==

    Some of the problems that have been solved using the distributive law can be grouped as follows:

      • Decoding algorithms: A GDL-like algorithm was used by Gallager for decoding low-density parity-check codes. Based on Gallager's work, Tanner introduced the Tanner graph and expressed Gallager's work in message-passing form. The Tanner graph also helped explain the Viterbi algorithm. Forney observed that Viterbi's maximum likelihood decoding of convolutional codes also used algorithms of GDL-like generality.
      • Forward–backward algorithm: The forward–backward algorithm tracks the states of a Markov chain, and it too uses an algorithm of GDL-like generality.
      • Artificial intelligence: The notion of junction trees has been used to solve many problems in AI. The concept of bucket elimination also uses many of the same concepts.

    == The MPF problem ==

    MPF, or marginalize a product function, is a general computational problem that includes as special cases many classical problems, such as computation of the discrete Hadamard transform, maximum likelihood decoding of a linear code over a memoryless channel, and matrix chain multiplication. The power of the GDL lies in the fact that it applies to situations in which additions and multiplications are generalized. A commutative semiring is a good framework for explaining this behavior. It is defined over a set $K$ with operators "$+$" and "$\cdot$", where $(K, +)$ and $(K, \cdot)$ are commutative monoids and the distributive law holds.
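    The factorization from the introduction carries over unchanged to any commutative semiring. The following is a minimal Python sketch, not taken from the Aji and McEliece tutorial: the functions `f` and `g`, the set `A`, and all helper names are illustrative assumptions. It evaluates $\alpha(a, b)$ both term by term and in factored form, once with ordinary (sum, product) and once with (min, sum), and the two routes should agree in each case.

```python
from itertools import product


def naive_marginal(f, g, A, add, mul, zero):
    """Evaluate alpha(a, b) term by term: q^2 pairs times q^3 triplets."""
    table = {}
    for a, b in product(A, repeat=2):
        total = zero
        for c, d, e in product(A, repeat=3):
            total = add(total, mul(f(a, c, b), g(a, d, e)))
        table[(a, b)] = total
    return table


def factored_marginal(f, g, A, add, mul, zero):
    """Evaluate alpha(a, b) = alpha1(a, b) * alpha2(a) after factoring."""
    # alpha2(a): marginalize g over (d, e), once per value of a.
    alpha2 = {}
    for a in A:
        acc = zero
        for d, e in product(A, repeat=2):
            acc = add(acc, g(a, d, e))
        alpha2[a] = acc
    # alpha1(a, b): marginalize f over c, then combine with alpha2(a).
    table = {}
    for a, b in product(A, repeat=2):
        alpha1 = zero
        for c in A:
            alpha1 = add(alpha1, f(a, c, b))
        table[(a, b)] = mul(alpha1, alpha2[a])
    return table


# Hypothetical integer-valued kernels over A = {0, 1, 2} (so q = 3).
A = range(3)
f = lambda a, c, b: a + 2 * c + 3 * b
g = lambda a, d, e: a * d + e + 1

# Ordinary sum-product semiring: "+" is addition, "." is multiplication.
sum_product = dict(add=lambda x, y: x + y, mul=lambda x, y: x * y, zero=0)
# Min-sum semiring: "+" is min (identity: infinity), "." is addition.
min_sum = dict(add=min, mul=lambda x, y: x + y, zero=float("inf"))

for semiring in (sum_product, min_sum):
    assert naive_marginal(f, g, A, **semiring) == factored_marginal(f, g, A, **semiring)
```

    Passing the two operators and the additive identity as arguments keeps the loops identical across semirings; only the meaning of "add" and "multiply" changes.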
    Let $p_{1}, \ldots, p_{n}$ be variables such that $p_{1} \in A_{1}, \ldots, p_{n} \in A_{n}$, where each $A_{i}$ is a finite set and $|A_{i}| = q_{i}$ for $i = 1, \ldots, n$. If $S = \{i_{1}, \ldots, i_{r}\}$ and $S \subset \{1, \ldots, n\}$, let $A_{S} = A_{i_{1}} \times \cdots \times A_{i_{r}}$, $p_{S} = (p_{i_{1}}, \ldots, p_{i_{r}})$, $q_{S} = |A_{S}|$, $\mathbf{A} = A_{1} \times \cdots \times A_{n}$, and $\mathbf{p} = \{p_{1}, \ldots, p_{n}\}$.

    Let $S = \{S_{j}\}_{j=1}^{M}$ where $S_{j} \subset \{1, \ldots, n\}$. Suppose a function is defined as $\alpha_{i} : A_{S_{i}} \rightarrow R$, where $R$ is a commutative semiring. The $p_{S_{i}}$ are called the local domains and the $\alpha_{i}$ the local kernels.

    The global kernel $\beta : \mathbf{A} \rightarrow R$ is defined as:

    $$\beta(p_{1}, \ldots, p_{n}) = \prod_{i=1}^{M} \alpha_{i}(p_{S_{i}})$$

    Definition of the MPF problem: For one or more indices $i = 1, \ldots, M$, compute a table of the values of the $S_{i}$-marginalization of the global kernel $\beta$, which is the function $\beta_{i} : A_{S_{i}} \rightarrow R$ defined as

    $$\beta_{i}(p_{S_{i}}) = \sum_{p_{S_{i}^{c}} \in A_{S_{i}^{c}}} \beta(p)$$

    Here $S_{i}^{c}$ is the complement of $S_{i}$ with respect to $\{1, \ldots, n\}$, and $\beta_{i}(p_{S_{i}})$ is called the $i^{\text{th}}$ objective function, or the objective function at $S_{i}$. It can be observed that computing the $i^{\text{th}}$ objective function in the obvious way needs $M q_{1} q_{2} q_{3} \cdots q_{n}$ operations. This is because there are $q_{1} q_{2} \cdots q_{n}$ additions and $(M - 1) q_{1} q_{2} \cdots q_{n}$ multiplications needed in the computation of the $i^{\text{th}}$ objective function. The GDL algorithm, which is explained in the next section, can reduce this computational complexity.

    The following is an example of the MPF problem. Let $p_{1}, p_{2}, p_{3}, p_{4},$ and $p_{5}$ be variables such that $p_{1} \in A_{1},\ p_{2} \in A_{2},\ p_{3} \in A_{3},\ p_{4} \in A_{4},$ …

  • Small Data

    Small Data: the Tiny Clues that Uncover Huge Trends is Martin Lindstrom's seventh book. It chronicles his work as a branding expert, working with consumers across the world to better understand their behavior. The theory behind the book is that businesses can create better products and services by observing consumer behavior in people's homes, as opposed to relying solely on big data.

    == Content ==

    The book is based on several years of consumer studies for major corporations across the globe. It features case studies of the author's work interviewing consumers in their homes and using his observations to form hypotheses about why they use products the way they do.

    == Public reception ==

    The book was a New York Times Bestseller upon release and was positively reviewed on several websites, including Entrepreneur and Forbes. In 2016, it was named a Best Business Book by strategy+business and one of Inc. Magazine's Best Sales and Marketing books.

  • Amaq News Agency

    Amaq News Agency (Arabic: وكالة أعماق الإخبارية, romanized: Wakālat Aʻmāq al-Ikhbārīyah) is a news outlet linked to the Islamic State (IS). Amaq is often the "first point of publication for claims of responsibility" for terrorist attacks in Western countries by the Islamic State. In March 2019, Amaq News Agency was designated as a foreign terrorist organization by the United States Department of State.

    == History ==

    The founders of Amaq included Syrian journalist Baraa Kadek, who joined IS in late 2013; Abu Muhammad al-Furqan; and seven others who originally worked for Halab News Network. According to The New York Times, it has a direct connection with IS, from which it "gets tips". Its name was taken from Amik Valley in Hatay Province, which is mentioned in a hadith as the site of an "apocalyptic victory over non-believers".

    Amaq News Agency was first noticed by SITE during the Siege of Kobanî (Syria) in 2014, when its updates were shared among IS fighters. It became more widely known after it began reporting claims of responsibility for terrorist attacks in Western countries, such as the 2015 San Bernardino attack, for which IS officially claimed responsibility the next day. An Amaq cameraman shot the first footage of the capture of Palmyra in 2015.

    Amaq launched an official mobile app in 2015 and has warned against unofficial versions that reportedly have been used to spy on its users. It also uses a Telegram account. It had a WordPress-based blog, which was removed without explanation in April 2016.

    On 12 June 2016, IS claimed responsibility for the Pulse nightclub shooting through Amaq, without prior knowledge of the attack. The shooter, Omar Mateen, had pledged allegiance to IS via a phone call with emergency services.

    On 31 May 2017, a Facebook post announced that Amaq's founder, Baraa Kadek, also known as Rayan Meshaal, had been killed with his daughter by an American airstrike on Mayadin. The post was reportedly made by his younger brother. Reuters could not immediately verify this account. On 27 July 2017, the US confirmed that Kadek had been killed by a coalition airstrike near Mayadin between 25 and 27 May 2017.

    In June 2017, German police arrested a 23-year-old Syrian man identified only as Mohammed G., accusing him of communicating with the alleged perpetrator of the 2016 Malmö Muslim community centre arson in order to report to Amaq.

    On 21 March 2019, the U.S. Department of State officially deemed Amaq an alias of IS, and thus a Foreign Terrorist Organization.

    On 22 March 2024, the Islamic State claimed responsibility for the Crocus City Hall attack through Amaq; U.S. officials confirmed the claim shortly after. A day after the attack, Amaq published a video of the attack, filmed by one of the attackers. It showed the attackers shooting victims and slitting the throat of another, while the filming attacker praises Allah and speaks against infidels.

    == Character ==

    Amaq publishes a stream of short news reports, both text and video, on the mobile app Telegram. The reports take on the trappings of mainstream journalism, with "Breaking News" headings and embedded reporters at the scenes of IS battles. The reports try to appear neutral, toning down the jihadist language and sectarian slurs IS uses in its official releases.

    Charlie Winter of the Transcultural Conflict and Violence Initiative at Georgia State University and Rita Katz of SITE Intelligence Group in Washington say Amaq functions much like the state-owned news agency of IS, though the group does not acknowledge it as such. Katz said it behaves "like a state media". Amaq appears to have been allowed to develop by IS as a way to have a news outlet that is controlled by the group but is somewhat removed from it, giving IS more of an appearance of legitimacy.

    == Reliability ==

    According to Rukmini Callimachi in The New York Times: "Despite a widespread view that the Islamic State opportunistically claims attacks with which it has little genuine connection, its track record, minus a handful of exceptions, suggests a more rigorous protocol. At times, the Islamic State has got details wrong, or inflated casualty figures, but the gist of its claims is typically correct." According to Callimachi, the group considers itself responsible for acts carried out by people who were inspired by its propaganda as well as acts carried out by its own personnel, and in some instances it has claimed attacks before the identities of the killers were known.

    Graeme Wood wrote in The Atlantic in October 2017: "The idea that the Islamic State simply scans the news in search of mass killings, then sends out press releases in hope of stealing glory, is false. Amaq may learn details of the attacks from mainstream media ... but its claim of credit typically flows from an Amaq-specific source."

    An October 2017 article in The Hill points to two false claims made in the summer of 2017: the Resorts World Manila attack and a false claim that bombs had been planted at Charles de Gaulle Airport in Paris. A claimed IS connection to the 2017 Las Vegas shooting also proved to be false.

    According to Rita Katz on the SITE Intelligence Group website, calling a terrorist a "soldier of the caliphate (warrior from the caliphate)" in a statement issued by Amaq was the usual way in which IS indicated that it had inspired an attack. Centrally coordinated attacks were usually described as "executed by a detachment belonging to the Islamic State", and were often announced by both Amaq and IS's central media command.

    == Online presence ==

    In November 2019, Belgian police said they had carried out a successful cyberattack on Amaq, leaving IS without an operational communication channel. However, Amaq has since regained an online presence, primarily on dark web platforms, which makes it harder for law enforcement to take it down without physical access to the server hosting the specific platform.

  • Automatic image annotation

    Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata in the form of captioning or keywords to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest from a database.

    This method can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors and the training annotation words are used by machine learning techniques to attempt to automatically apply annotations to new images. The first methods learned the correlations between image features and training annotations. Subsequently, techniques were developed using machine translation to attempt to translate the textual vocabulary into the "visual vocabulary", represented by clustered regions known as blobs. Later work has included classification approaches, relevance models, and other related methods.

    The advantage of automatic image annotation over content-based image retrieval (CBIR) is that queries can be specified more naturally by the user. At present, CBIR generally requires users to search by image concepts such as color and texture, or by finding example queries. However, certain image features in example images may override the concept that the user is truly focusing on. Traditional methods of image retrieval, such as those used by libraries, have relied on manually annotated images, which is expensive and time-consuming, especially given the large and constantly growing image databases in existence.
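    As a rough illustration of applying training annotations to a new image from its feature vector, the sketch below uses nearest-neighbour label transfer. This is a simple baseline swapped in for illustration, not one of the specific correlation, translation, or relevance-model methods mentioned above. The three-element feature vectors, the annotation words, and the `annotate` function are all hypothetical.

```python
import math
from collections import Counter


def annotate(query, training_set, k=3, n_words=2):
    """Return the n_words most frequent annotation words among the
    k training images whose feature vectors are closest to `query`."""
    neighbours = sorted(
        training_set, key=lambda item: math.dist(query, item[0])
    )[:k]
    votes = Counter(word for _, words in neighbours for word in words)
    return [word for word, _ in votes.most_common(n_words)]


# Hypothetical training data: (feature vector, annotation words).
# A real system would extract the vectors with image analysis.
training_set = [
    ([0.90, 0.10, 0.10], ["sunset", "sky"]),
    ([0.80, 0.20, 0.10], ["sunset", "beach"]),
    ([0.10, 0.80, 0.20], ["forest", "tree"]),
    ([0.20, 0.90, 0.10], ["forest", "grass"]),
    ([0.85, 0.15, 0.20], ["sky", "sunset"]),
]

# Annotate an unseen image from its (hypothetical) feature vector.
print(annotate([0.88, 0.12, 0.12], training_set))
```

    Because every vocabulary word is a possible output, the word list acts as the class set, which matches the view of annotation as classification with as many classes as there are words.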

  • Ecoinformatics

    Ecoinformatics, or ecological informatics, is the science of information in ecology and environmental science. It integrates environmental and information sciences to define entities and natural processes with language common to both humans and computers. However, this is a rapidly developing area in ecology and there are alternative perspectives on what constitutes ecoinformatics.

    A few definitions have been circulating, mostly centered on the creation of tools to access and analyze natural system data. However, the scope and aims of ecoinformatics are broader than the development of metadata standards to be used in documenting datasets. Ecoinformatics aims to facilitate environmental research and management by developing ways to access and integrate databases of environmental information, and by developing new algorithms that enable different environmental datasets to be combined to test ecological hypotheses. Ecoinformatics is related to the concept of ecosystem services.

    Ecoinformatics characterizes the semantics of natural system knowledge. For this reason, much of today's ecoinformatics research relates to the branch of computer science known as knowledge representation, and active ecoinformatics projects are developing links to activities such as the Semantic Web.

    Current initiatives to effectively manage, share, and reuse ecological data are indicative of the increasing importance of fields like ecoinformatics in developing the foundations for effectively managing ecological information. Examples of these initiatives are National Science Foundation Datanet projects, DataONE, Data Conservancy, and Artificial Intelligence for Environment & Sustainability.

    == Software Development Lifecycle ==

    Central to the concept of ecoinformatics is the Software Development Lifecycle (SDLC), a systematic framework for writing, implementing, and maintaining software products. Typically in ecoinformatics projects, the development pipeline includes collecting data, usually from several different environmental data sources, then integrating these data sources, and then analyzing the data. Here, each step of the SDLC is described in the context of ecoinformatics, per Michener et al. It is important to note that the plan, collect, assure, describe, and preserve steps refer to the data collection entity, which can be individual researchers or large data-collection networks, while the discover, integrate, and analyze steps typically refer to the individual researcher.

      • Plan: Ecoinformatics projects require data from several databases. Each database holds different data, and therefore researchers should identify what types of environmental or ecological data they will need to answer their research question.
      • Collect: Data is collected in several different ways. In ecoinformatics, this is usually restricted to manually entering data into a spreadsheet and parsing data from an existing database. The growth of relational databases has made it easier for ecologists to download relevant data and integrate datasets.
      • Assure: Data entries should be checked thoroughly to validate their accuracy and usability, for example by checking for outliers and erroneous points. The same principle applies to data downloaded from datasets. This responsibility falls on both the ecologist downloading the data and the entity that sets up the data collection system.
      • Describe: An accurate description of the metadata of a dataset used in a study should include enough information to deduce the data collection and processing methodology, when the data were collected, why the data were collected, and how the data were stored. This is important for reproducibility, especially for projects that build on each other and may recycle data.
      • Preserve: After data is collected by an institutional entity, it should be archived such that it is easily accessible. Ideally, this is in databases that are maintained and not at risk of deprecation.
      • Discover: While there are good practices for discovering data to start a research project, this process is often marred by a lack of usable, published data, as researchers may collect data specific to their study but not publish it for wider use. On the data collection end, this can be addressed by better data-sharing practices, such as linking datasets when publishing papers or studies. On the data procurement end, this can be addressed by more precise data searching, such as using keywords to find relevant datasets.
      • Integrate: Synthesizing datasets can be difficult and labor-intensive, largely due to methodological differences in data collection. There are several approaches to this, but the best practices typically involve computational approaches, namely using R or Python, to automate the processes and prevent errors.
      • Analyze: Data analysis can take several forms, and should be tailored to the specific ecological project. However, all data analysis methods should be well documented, including the procedure for analysis, the justification for the analysis methods, and any shortcomings of a specific approach.

    == Applications of Ecoinformatics Across Ecology ==

    === Ecosystem Ecology ===

    Ecosystem studies, by definition, encompass interactions across the entire life sciences spectrum, from microscopic biochemical reactions to large-scale geological phenomena. As a result, big databases may not be designed specifically for any particular research question, but should be inclusive enough to support most studies. Since ecosystem-level questions require a broad perspective, data-related ecosystem projects would likely incorporate data from several databases.
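    Combining data from several sources, together with the assure and integrate steps of the lifecycle described earlier, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the two record sets, their field names (`water_temp_c`, `mayfly_count`), the plausibility range, and the helper functions `assure` and `integrate` are all hypothetical and stand in for whatever the actual data sources provide.

```python
def assure(records, field, low, high):
    """Keep only records whose `field` is present and within [low, high]."""
    return [
        r for r in records
        if r.get(field) is not None and low <= r[field] <= high
    ]


def integrate(left, right, keys):
    """Inner-join two lists of dict records on the given key fields."""
    index = {tuple(r[k] for k in keys): r for r in right}
    merged = []
    for r in left:
        match = index.get(tuple(r[k] for k in keys))
        if match is not None:
            merged.append({**match, **r})
    return merged


# Hypothetical sensor readings from one source...
temps = [
    {"site": "A", "date": "2020-05-01", "water_temp_c": 11.2},
    {"site": "A", "date": "2020-05-02", "water_temp_c": 999.0},  # error code
    {"site": "B", "date": "2020-05-01", "water_temp_c": 9.8},
    {"site": "B", "date": "2020-05-02", "water_temp_c": None},   # missing
]
# ...and hypothetical survey counts from another.
counts = [
    {"site": "A", "date": "2020-05-01", "mayfly_count": 14},
    {"site": "A", "date": "2020-05-02", "mayfly_count": 9},
    {"site": "B", "date": "2020-05-01", "mayfly_count": 21},
]

clean = assure(temps, "water_temp_c", low=-5.0, high=40.0)  # assure step
combined = integrate(clean, counts, keys=("site", "date"))  # integrate step
for row in combined:
    print(row)
```

    Scripting both steps, rather than editing spreadsheets by hand, keeps the filtering rule and the join keys visible and repeatable, which also helps when documenting the analysis.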
A common framework for incorporating data into ecosystem-level studies is the network science model, in which data collection mechanisms and resources are treated like a large, interconnected network instead of individual entities. The network may include several data collection stations within one database, or may span multiple databases. Currently there are several large-scale networks, but they do not yet generate data at a scale that would make ecology a big data science. A current challenge for ecoinformatics in ecosystem ecology is that most funding is prioritized for generating new data rather than maintaining existing data infrastructures. Integrating data across different spatial scales can also be difficult, since each dataset may hold different types of data. === Urban Ecology === The current push for smart cities, and for sensor network integration into infrastructure, has positioned cities as a major source of data for ecological studies. Typical urban ecology questions address the effects of urbanization on the local ecosystem, and how to drive future development to promote urban biodiversity. While sensor networks in cities typically collect environmental data to optimize city processes, they may also be used for ecological initiatives, especially with respect to understanding the complex, multi-layered relationship between cities and their local ecosystems. These data can also be used to better understand the current landscape of cities and to identify avenues for rewilding cities. For example, analyzing mobility patterns can identify areas that may lend themselves well to building parks and green spaces. Bird-watching data can also be used to identify the bird species in a local area. === Infectious Disease === Like other disciplines of ecology, emerging infectious disease and epidemiology span multiple scales, from understanding the genetics that drive disease trends to large-scale spatiotemporal analyses. 
As a result, infectious disease studies can incorporate bioinformatics, genetic sequences, amino acid sequences, and environmental observation data. On the micro-scale, these data can be used to predict infectivity/transmissibility, drug resistance, drug candidates, and mutation sites. On the macro-scale, they can be used to identify societal trends or environmental factors that lend themselves to spillover, locations of infection, and practices that cause disease transmission. == Databases == USGS National Streamflow sensor network GBIF Neotoma Paleobiology database European Vegetation Archive USDA Forest Inventory Analysis TRY BIEN AmeriFlux TEAM iNaturalist NEON GLEON LTER CZO TERN SAEON
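The Assure and Integrate steps described earlier in this article can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, field names (`site`, `date`, `temp_c`, `species_count`) and the sample values are hypothetical, and the outlier rule (a modified z-score based on the median absolute deviation, with a cutoff of 3.5) is one common choice rather than a method prescribed by the source.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Assure step (sketch): flag values whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        return [False] * len(values)  # no spread, so nothing can be flagged
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

def integrate(a_rows, b_rows, key=("site", "date")):
    """Integrate step (sketch): inner-join two record lists on shared key fields."""
    index = {tuple(row[k] for k in key): row for row in b_rows}
    merged = []
    for row in a_rows:
        match = index.get(tuple(row[k] for k in key))
        if match is not None:
            merged.append({**row, **match})
    return merged

# Hypothetical example data: temperatures from one source, counts from another.
temps = [12.1, 11.8, 12.4, 55.0, 12.0]
flags = flag_outliers(temps)

weather = [{"site": "S1", "date": "2020-01-01", "temp_c": 12.1}]
surveys = [
    {"site": "S1", "date": "2020-01-01", "species_count": 7},
    {"site": "S2", "date": "2020-01-01", "species_count": 3},
]
combined = integrate(weather, surveys)
```

Scripting both steps, rather than editing spreadsheets by hand, keeps the checks repeatable when a dataset is downloaded again or extended.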

    Read more →
  • Social information architecture

    Social information architecture

    Social information architecture, also known as social iA, is a sub-domain of information architecture which deals with the social aspects of conceptualizing, modeling and organizing information. It has become more relevant because of the rise of social media and Web 2.0 in recent times. == Approach == There are different approaches to the explanation of social information architecture. === Architecture model (internal space) === Architects designing a physical community space have to consider how the architecture will shape social interactions. A long hallway of offices creates an utterly different dynamic than desks arranged in an open space. One might foster individuality, privacy, propriety; the other: collaboration, distraction, communalism. Still, physical spaces can be flexibly repurposed and worked around if the inhabitants desire a social dynamic not instantly afforded by the space. Office doors can be left open to invite easier interaction. Partitions can be raised between adjacent desks to limit distraction and increase privacy. That's physical architecture. The information architectures of online communities are far more deterministic and far less flexible. They literally define the social architecture by pre-specifying in immutable computer code what information you have access to, who you can talk to, where you can go. In the online world, information architecture = social architecture. === Social dialogue and information model (external space) === All major brands use information architecture to market their products online; it is then commonly wrapped under the umbrella phrase 'digital strategy'. Information architecture used for strategic purposes encompasses brand SEO, strategic placement of viral content, social media presence, etc. Charities, news outlets and social dialogue forums can make a much more specific use of the same tools for positive and important social purposes. 
Social information architecture is perceived as the socially conscious wing of commercial information architecture and functions to exchange information and ideas between people and groups. Social iA can pick up on conflicting issues that are treated with misunderstanding between cultures and leave individuals and societies vulnerable to exploitation and manipulation. Because the net has such a far reach, it is a natural medium for meaningful and coordinated social dialogue. Examples of such issues are faith, environment, politics, climate change, war, injustice and other social challenges. Information architecture can help create frameworks in which sharing information brings people together, and inspires and encourages them to participate in a forward-thinking and unfragmented way. One of its core activities is to spread messages that bring people from opposite sides of social and cultural spectrums together and to confront uncomfortable subjects head on. == How does social information architecture work? == Social iA utilizes a variety of Web 2.0 applications to filter relevant or valuable information and weave it into an appropriate information repository, or to provide feedback to channels of interest. Social iA makes strategic use of search engines, social media, Google algorithms, as well as websites, video and news channels. It 'reads' or 'listens' to social conversations and search engine queries and engages with the net actively to gather clues about the world's pulse on the internet. It assesses data and social and political trends, and responds with targeted campaigns to give people ideas, as well as to help people make sense of information. == Principles == Dan Brown, in his paper 8 Principles of Social Information Architecture, lists the following principles: 1. The principle of objects: Treat content as a living, breathing thing, with a lifecycle, behaviors and attributes. 2. 
The principle of choices: Create pages that offer meaningful choices to users, keeping the range of choices available focused on a particular task. 3. The principle of disclosure: Show only enough information to help people understand what kinds of information they'll find as they dig deeper. 4. The principle of exemplars: Describe the contents of categories by showing examples of the contents. 5. The principle of front doors: Assume at least half of the website's visitors will come through some page other than the home page. 6. The principle of multiple classification: Offer users several different classification schemes to browse the site's content. 7. The principle of focused navigation: Don't mix apples and oranges in your navigation scheme. 8. The principle of growth: Assume the content you have today is a small fraction of the content you will have tomorrow. == What can social information architecture achieve? == Social information architecture has considerable potential to foster social connections and to shape how information is shared in social spaces on the web.

    Read more →
  • Reverse correlation technique

    Reverse correlation technique

    The reverse correlation technique is a data-driven study method used primarily in psychological and neurophysiological research. The method earned its name from its origins in neurophysiology, where cross-correlations between white noise stimuli and sparsely occurring neuronal spikes could be computed more quickly when computed only for the segments preceding the spikes. The term has since been adopted in psychological experiments that usually do not analyze the temporal dimension but also present noise to human participants. In contrast to the original meaning, the term is here thought to reflect that the standard psychological practice of presenting stimuli of defined categories to the participants is "reversed": instead, the participant's mental representations of categories are estimated from interactions of the presented noise and the behavioral responses. It is used to create composite pictures of individual and/or group mental representations of various items (e.g. faces, bodies, and the self) that depict characteristics of said items (e.g. trustworthiness and self-body image). This technique is helpful when evaluating the mental representations of those with and without mental illnesses. == Terms == This technique utilizes the spike-triggered average to explain what areas of signal and noise in an image are valuable for the given research question. Signal is information used to produce objects of value that help explain and connect the world around us. Noise is commonly referred to as unwanted signal that obscures the information that the signal is trying to present. Most importantly for reverse correlation studies, noise is randomly varying information. To determine the areas of importance using reverse correlation, noise is applied to a base image and then evaluated by observers. A base image is any image void of noise that relates to the research question. 
A base image that has noise superimposed on top is the stimulus that is presented to and evaluated by participants. Each presentation of a new set of stimuli to a participant is known as a trial. After a participant has responded to hundreds to thousands of trials, a researcher is ready to create a classification image. A classification image (abbreviated as "CI" in some studies) is a single image that represents the average noise patterns in the images selected by participants. A classification image can also be computed for groups by averaging the individuals' classification images. These classification images are what researchers use to interpret the data and draw conclusions. As a whole, the reverse correlation method is a process that results in a composite image (from an individual or group) that can be used to estimate and interpret mental representations. == Basic study layout == The reverse correlation method is typically executed as an in-lab computer experiment. This method follows four broad steps. Each of the following steps is described in greater detail below. After creating a research question and determining that the reverse correlation method is the most suitable technique to answer the question, a researcher must (1) design randomly varying stimuli. After the stimuli have been prepared, a researcher should (2) collect data from participants who will see and respond to approximately 300 to 1,000 trials. Each trial will consist of either one or two images (side by side) derived from the same base image with noise superimposed on top. Participant responses will depend on the chosen study design; if a researcher presents only one image at a time, participants rate the image on a 4-point scale, but when two images are shown, the participant is asked to choose which best aligns with the given category (e.g. choose the image that looks the most aggressive). 
Once all of the data is collected, the researcher will (3) compute classification images for each participant and, using those images, compute group classification images. Finally, with the classification images available, the researcher will (4) evaluate the images and draw conclusions about their results. === Step 1: making stimuli === When designing the stimuli for a reverse correlation study, the two primary factors that one should consider are (1) the base image and (2) the noise that will be used. While not all bases are images per se, the majority are, and for this reason the base is typically referred to as a base image. The base image should represent whatever the research question is addressing. For example, if you are interested in people's mental representations of Chinese people, it would not make sense to use a base image of a Spanish or Caucasian person. Likewise, if you are interested in the mental representations of male vocal patterns, it would make the most sense to use a base vocal pattern that has been produced by a male. Having a base is important because it provides a kind of anchor for participants to work from. When there is no base image, the number of trials that are required increases dramatically, thus making it harder to collect data. While there are studies that have excluded a base image (e.g. the S study), for more elaborate and nuanced research questions it is important to have a base image that is a fair representation of what participants are being asked to categorize. Photographs of faces are generally the most popular base image. Although the reverse correlation method is capable of investigating a wide variety of research questions, the most common application of the method is for evaluating faces on a single trait. Reverse correlation studies that address evaluations of the face are sometimes referred to as using a face space reverse correlation model (FSRCM). 
Thankfully, there are existing databases of face images of varying demographics and emotions that work well as base images. The reverse correlation method can also be used to help researchers identify what areas of an image (e.g. the areas on the face) have diagnostic value. In order to identify these areas of value, researchers start by minimizing the space a participant can pull information from. By imposing a “mask” on an image (e.g. blurring an image while leaving random areas un-blurred), researchers reduce the information individuals might see and force them to focus on certain areas. Then, when participants are able to correctly and repeatedly identify an image with a trait, conclusions can be drawn about what areas have diagnostic value. While faces and visual stimuli are the most popular, these are not the only stimuli that can be used in a reverse correlation study. The method was originally designed for auditory stimuli, which allows researchers to investigate how perceivers interpret auditory information and make trait-based attributions to different sound patterns. For example, by segmenting a vocal recording of a single word (total sound time 426 ms) into six segments (71 ms each), and varying each segment's pitch using Gaussian distributions, researchers were able to uncover what vocal patterns people associated with certain traits. Specifically, this study investigated how listeners rated sound clips of the word “really” as sounding more interrogative (i.e., as in the more common reverse correlation studies, participants listened to two sound clips per trial and chose which fit the category best, and the researchers then created an average of the pitch contours). Beyond face and auditory perception, research utilizing the reverse correlation method has expanded to investigate how individuals see three-dimensional objects in images with noise (but no signal). 
After selecting your base image, regardless of what the image is, it is helpful to apply a Gaussian blur to smooth noise in the image. While noise will be applied later, it is helpful to reduce existing noise in the photo before applying your chosen noise. There are three primary choices when it comes to noise: white noise, sine-wave noise, and Gabor noise. The latter two constrain the configurations that the noise can have, and for this reason white noise is the most commonly used. Regardless of the type of noise that is chosen, it is crucial that the noise randomly varies. === Step 2: data collection === Once the stimuli for the study have been developed, the researcher must make a few decisions before actually collecting the data. The researcher must decide how many stimuli will be presented at a time and how many trials the participants will see. In terms of stimuli presentation, a researcher can choose either a 2-Image Forced Choice (2IFC) or a 4-Alternative Forced Choice (4AFC) design. The 2IFC presents two images at once (side by side) and requires participants to choose between the two on a specified category (e.g. which image looks the most like a male). Typically the noise from the left image is the mathematical inverse of the noise from the right image. This method was developed to better answer questions that could n
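The 2IFC procedure and the classification image described above can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not a published pipeline: images are flat lists of six "pixels", the observer is simulated by a hypothetical hidden `TEMPLATE` and always picks the image whose noise matches that template better, and all function names are illustrative. The base image is left out because it is identical in both images of a trial and does not affect the averaged noise.

```python
import random

def classification_image(template, n_trials=500, seed=0):
    """Average the noise patterns a simulated observer selects in 2IFC trials."""
    rng = random.Random(seed)
    n_pixels = len(template)
    total = [0.0] * n_pixels
    for _ in range(n_trials):
        # White (Gaussian) noise; the two images shown are base+noise and base-noise.
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]
        # How well the +noise image matches the observer's hidden template.
        match = sum(n * t for n, t in zip(noise, template))
        # The observer chooses +noise if it matches better, otherwise -noise.
        sign = 1.0 if match >= 0 else -1.0
        for i in range(n_pixels):
            total[i] += sign * noise[i]
    return [s / n_trials for s in total]

def group_classification_image(individual_cis):
    """Average individual classification images pixel by pixel."""
    n = len(individual_cis)
    return [sum(pixel) / n for pixel in zip(*individual_cis)]

# Hypothetical mental template: first two pixels bright, last two dark.
TEMPLATE = [1.0, 1.0, 0.0, 0.0, -1.0, -1.0]
ci_one = classification_image(TEMPLATE, seed=1)
ci_two = classification_image(TEMPLATE, seed=2)
group_ci = group_classification_image([ci_one, ci_two])
```

With enough trials the averaged noise tends to resemble the hidden template, which is why the classification image can be read as an estimate of the observer's mental representation.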

    Read more →
  • QuickPar

    QuickPar

    QuickPar is a computer program that creates parchives used as verification and recovery information for a file or group of files, and uses the recovery information, if available, to attempt to reconstruct the originals from the damaged files and the PAR volumes. Designed for the Microsoft Windows operating system, in the past it was often used to recover damaged or missing files that had been downloaded through Usenet. QuickPar may also be used under Linux via Wine. There are two main versions of PAR files: PAR and PAR2. The PAR2 file format lifts many of the restrictions of the original PAR format. QuickPar is freeware but not open-source. It uses the Reed–Solomon error correction algorithm internally to create the error-correcting information. == Replacement == Since QuickPar has not been updated in 21 years, it is considered abandonware. Currently, MultiPar is accepted as the software that replaces QuickPar. MultiPar is actively being developed by Yutaka Sawada. == 64-bit versions == At present, the command-line version of QuickPar for Linux is available as a 64-bit version. None of the GUI versions presently available offers a 64-bit version.
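The idea of recovery information can be illustrated with a deliberately simplified Python sketch. It does not use Reed–Solomon coding and does not read or write PAR files: it uses a single XOR parity block, which can rebuild only one missing block, and it assumes all blocks have equal length. The function names are hypothetical.

```python
def xor_parity(blocks):
    """Build one parity block as the byte-wise XOR of equal-length blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity block."""
    return xor_parity(list(surviving_blocks) + [parity])

# Hypothetical data blocks; the second one is treated as lost.
blocks = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(blocks)
rebuilt = recover_missing([blocks[0], blocks[2]], parity)
```

Reed–Solomon codes generalize this idea so that several recovery blocks can repair several damaged or missing blocks, at the cost of more involved arithmetic.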

    Read more →
  • Manufacture Modules Technologies

    Manufacture Modules Technologies

    Manufacture Modules Technologies Sarl (MMT) is a Swiss company established in Geneva in 2015 which originally specialised in the development and commercialization of "Horological Smartwatch modules", firmware, apps and cloud services. Located at Geneva's Skylab high-tech hub, it expanded into the development and manufacturing of "E-Straps" operated with a mobile application. Philippe Fraboulet is the CEO. == History == In June 2015, Fullpower Technologies and Union Horlogère Suisse (Swiss Watchmakers Corporation) formed MMT as a joint venture, which then launched the MotionX Horological Smartwatch Open Platform for the Swiss watch industry. The initial licensees were Frederique Constant, Alpina and Mondaine, brands owned by Union Horlogère Suisse. Fullpower created and managed the circuit design, firmware, smartphone applications (including sleep activity), as well as the cloud infrastructure. MMT managed the Swiss watch movement development and production as well as licensing and support. In July 2016, Union Horlogere Holding and MMT were spun out of the Frédérique Constant Group. Fullpower Technologies' 19.99% share was acquired by Union Horlogere Holding BV, giving it 100% of MMT's shares. == Business == The company offers firmware, cloud services, manufacturing, servicing, and over-the-air upgrade facilities. The company also offers its own apps, which bear the label “Swiss Made software”.

    Read more →
  • Information school

    Information school

    Information school (sometimes abbreviated I-school or iSchool) is a university-level institution committed to understanding the role of information in nature and human endeavors. Synonyms include school of information, department of information studies, or information department. Information school faculty conduct research into the fundamental aspects of information and related technologies. In addition to granting academic degrees, information schools educate information professionals, researchers, and scholars for an increasingly information-driven world. Information school can also refer, in a more restricted sense, to the members of the iSchools organization (formerly the "iSchools Project"), as governed by the iCaucus. Members of this group share a fundamental interest in the relationships between people, information, technology, and science. These schools, colleges, and departments have been either newly established or have evolved from programs focused on information systems, library science, informatics, computer science, library and information science and information science. Information schools promote an interdisciplinary approach to understanding the opportunities and challenges of information management, with a core commitment to concepts like universal access and user-centered organization of information. The field is concerned broadly with questions of design and preservation across information spaces, from digital and virtual spaces like online communities, the World Wide Web, and databases to physical spaces such as libraries, museums, archives, and other repositories. 
Information school degree programs include course offerings in areas such as data science, information architecture, design, economics, policy, retrieval, security, and telecommunications; knowledge management, user experience design, and usability; conservation and preservation, including digital preservation; librarianship and library administration; the sociology of information; and human–computer interaction.

    Read more →
  • Token maxxing

    Token maxxing

    Token maxxing (or token maxing) is a metric used in an attempt to track productivity in the workplace, especially for those using artificial intelligence (AI) based services. AI services charge for each token; tokens represent units of effort expended by an AI service to solve a problem. Some believe that token consumption equates to productivity and thus can be used as a metric to monitor an employee's work. Supporters believe that higher token usage indicates higher productivity and higher utilization of powerful AI services. By the same reasoning, those not consuming enough tokens may be less productive and may be underutilizing powerful AI services. This belief might lead to an environment that incentivizes higher token usage as a proxy for increased productivity. Critics of token maxxing as a metric claim that prudent workers will maximize any metric that management wants increased in order to gain a workplace advantage. For example, engineers in the tech industry who are pressed to consume as many tokens as possible might run several AI agents in tandem, enter longer input prompts, or automate their tasks to maximize their token consumption. To management, this higher token usage may indicate potential productivity, but in reality it may cause additional token costs, worker burnout, or more bloated code of lower quality. Another claim is that AI service companies potentially benefit from such an emphasis on token consumption and actively encourage the trend. Some developers have publicly advocated the practice. Developer Sigrid Jin, who said he used 50 billion tokens in a single year, has argued that maximizing token consumption is the best way to understand the value of AI, advising others to spend as much on AI usage as they pay in rent to obtain a return on investment. == See Also == Goodhart's law, Perverse incentive, Jevons paradox

    Read more →
  • Dependency network (graphical model)

    Dependency network (graphical model)

    Dependency networks (DNs) are graphical models, similar to Markov networks, wherein each vertex (node) corresponds to a random variable and each edge captures dependencies among variables. Unlike Bayesian networks, DNs may contain cycles. Each node is associated with a conditional probability table, which determines the realization of the random variable given its parents. == Markov blanket == In a Bayesian network, the Markov blanket of a node is the set of parents and children of that node, together with the children's parents. The values of the parents and children of a node evidently give information about that node. However, its children's parents also have to be included in the Markov blanket, because they can be used to explain away the node in question. In a Markov random field, the Markov blanket for a node is simply its adjacent (or neighboring) nodes. In a dependency network, the Markov blanket for a node is simply the set of its parents. == Dependency network versus Bayesian networks == Dependency networks have advantages and disadvantages with respect to Bayesian networks. In particular, they are easier to parameterize from data, as there are efficient algorithms for learning both the structure and probabilities of a dependency network from data. Such algorithms are not available for Bayesian networks, for which the problem of determining the optimal structure is NP-hard. Nonetheless, a dependency network may be more difficult to construct using a knowledge-based approach driven by expert knowledge. == Dependency networks versus Markov networks == Consistent dependency networks and Markov networks have the same representational power. Nonetheless, it is possible to construct non-consistent dependency networks, i.e., dependency networks for which there is no compatible valid joint probability distribution. Markov networks, in contrast, are always consistent. 
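The two Markov blanket definitions given in the Markov blanket section can be compared with a short Python sketch. The graph representation (a dictionary mapping each node to the set of its parents), the function names and the example graph are assumptions made for illustration only.

```python
def bn_markov_blanket(parents, node):
    """Bayesian network: parents, children, and the children's other parents."""
    children = {v for v, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}

def dn_markov_blanket(parents, node):
    """Dependency network: the Markov blanket is just the set of parents."""
    return set(parents[node])

# Hypothetical directed graph given as node -> set of parents: A -> C, B -> C, C -> D.
GRAPH = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}

blanket_a = bn_markov_blanket(GRAPH, "A")   # includes co-parent B via child C
blanket_c = bn_markov_blanket(GRAPH, "C")
dn_blanket_c = dn_markov_blanket(GRAPH, "C")
```

In this example, B enters the Bayesian-network blanket of A only because both are parents of C, which is the "explaining away" case mentioned in the text.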
== Definition == A consistent dependency network for a set of random variables $\mathbf{X} = (X_1, \ldots, X_n)$ with joint distribution $p(\mathbf{x})$ is a pair $(G, P)$, where $G$ is a cyclic directed graph in which each node corresponds to a variable in $\mathbf{X}$, and $P$ is a set of conditional probability distributions. The parents of node $X_i$, denoted $\mathbf{Pa_i}$, correspond to those variables $\mathbf{Pa_i} \subseteq (X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n)$ that satisfy the following independence relationships: $p(x_i \mid \mathbf{pa_i}) = p(x_i \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n) = p(x_i \mid \mathbf{x} - x_i)$. The dependency network is consistent in the sense that each local distribution can be obtained from the joint distribution $p(\mathbf{x})$. Dependency networks learned using large data sets with large sample sizes will almost always be consistent. A non-consistent network is a network for which there is no joint probability distribution compatible with the pair $(G, P)$. In that case, there is no joint probability distribution that satisfies the independence relationships subsumed by that pair. == Structure and parameters learning == Two important tasks in a dependency network are to learn its structure and probabilities from data. Essentially, the learning algorithm consists of independently performing a probabilistic regression or classification for each variable in the domain. 
This follows from the observation that the local distribution for variable $X_i$ in a dependency network is the conditional distribution $p(x_i \mid \mathbf{x} - x_i)$, which can be estimated by any number of classification or regression techniques, such as methods using a probabilistic decision tree, a neural network or a probabilistic support-vector machine. Hence, for each variable $X_i$ in domain $\mathbf{X}$, we independently estimate its local distribution from data using a classification algorithm, although a different method may be used for each variable. Here, we will briefly show how probabilistic decision trees are used to estimate the local distributions. For each variable $X_i$ in $\mathbf{X}$, a probabilistic decision tree is learned where $X_i$ is the target variable and $\mathbf{X} - X_i$ are the input variables. To learn a decision tree structure for $X_i$, the search algorithm begins with a singleton root node without children. Then, each leaf node in the tree is replaced with a binary split on some variable $X_j$ in $\mathbf{X} - X_i$, until no more replacements increase the score of the tree. == Probabilistic Inference == Probabilistic inference is the task in which we wish to answer probabilistic queries of the form $p(\mathbf{y} \mid \mathbf{z})$, given a graphical model for $\mathbf{X}$, where $\mathbf{Y}$ (the 'target' variables) and $\mathbf{Z}$ (the 'input' variables) are disjoint subsets of $\mathbf{X}$. One of the alternatives for performing probabilistic inference is using Gibbs sampling. 
A naive approach for this uses an ordered Gibbs sampler, an important difficulty of which is that if either $p(\mathbf{y} \mid \mathbf{z})$ or $p(\mathbf{z})$ is small, then many iterations are required for an accurate probability estimate. Another approach for estimating $p(\mathbf{y} \mid \mathbf{z})$ when $p(\mathbf{z})$ is small is to use a modified ordered Gibbs sampler, where $\mathbf{Z} = \mathbf{z}$ is fixed during Gibbs sampling. It may also happen that $\mathbf{y}$ is rare, e.g. when $\mathbf{Y}$ has many variables. In that case, the law of total probability, along with the independencies encoded in a dependency network, can be used to decompose the inference task into a set of inference tasks on single variables. This approach comes with the advantage that some terms may be obtained by direct lookup, thereby avoiding some Gibbs sampling. The algorithm below can be used to obtain $p(\mathbf{y} \mid \mathbf{z})$ for a particular instance of $\mathbf{y} \in \mathbf{Y}$ and $\mathbf{z} \in \mathbf{Z}$, where $\mathbf{Y}$ and $\mathbf{Z}$ are disjoint subsets. 
Algorithm 1:
  $\mathbf{U} := \mathbf{Y}$ (the unprocessed variables)
  $\mathbf{P} := \mathbf{Z}$ (the processed and conditioning variables)
  $\mathbf{p} := \mathbf{z}$ (the values for $\mathbf{P}$)
  While $\mathbf{U} \neq \emptyset$:
    Choose $X_i \in \mathbf{U}$ such that $X_i$ has no more parents in $\mathbf{U}$ than any variable in $\mathbf{U}$
    If all the parents of $X_i$ are in $\mathbf{P}$
      $p(x_i \mid \mathbf{p}) := p(x_i \mid \mathbf{pa_i})$
    Else
      Use a modified ordered Gibbs sampler to determine $p(x_i \mid \mathbf{p})$
    $\mathbf{U} := \mathbf{U} - X_i$
    $\mathbf{P} := \mathbf{P} + X_i$
    $\mathbf{p} := \mathbf{p} + x_i$
  Return the product of the conditionals $p(x_i \mid \mathbf{p})$
== Applications == In addition to the applications to probabilistic inference, the following applications are in the category of Collaborative Filtering (CF), which is the task of predicting preferences. Dependency networks are a natural model class on which to base CF predictions, since an algorithm for this task only needs estimates of $p(x_i = 1 \mid \mathbf{x} - x_i = 0)$ to produce recommendations. In particular, these estimates may be obtained by a direct lookup in a dependency network. Examples of such tasks include: predicting what movies a person will like based on his or her ratings of movies seen; predicting what web pages a person will access based on his or her history on the site; predicting what news stories a person is interested in based on other stories he or she read; predicting what product a person will buy based on products he or she has already purchased and/or dropped into his or her shopping basket. 
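The modified ordered Gibbs sampler used in the inference procedure above, in which the conditioning variables stay fixed, can be sketched in Python for binary variables. This is a minimal sketch under stated assumptions: the two-variable joint table `JOINT`, the local-distribution functions and all names are hypothetical, and the local distributions are derived from the joint table only so that the dependency network is consistent and the estimates can be checked against known values.

```python
import random

# Hypothetical joint distribution over two binary variables (A, B); it is used
# only to derive consistent local distributions for this demonstration.
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def p_a1_given_b(state):
    """Local distribution p(A = 1 | B)."""
    b = state["B"]
    return JOINT[(1, b)] / (JOINT[(0, b)] + JOINT[(1, b)])

def p_b1_given_a(state):
    """Local distribution p(B = 1 | A)."""
    a = state["A"]
    return JOINT[(a, 1)] / (JOINT[(a, 0)] + JOINT[(a, 1)])

LOCAL = {"A": p_a1_given_b, "B": p_b1_given_a}

def gibbs_estimate(local, order, target, evidence=None,
                   n_iter=20000, burn_in=1000, seed=0):
    """Estimate p(target = 1 | evidence) by resampling variables in a fixed order."""
    evidence = dict(evidence or {})
    rng = random.Random(seed)
    state = {v: rng.randint(0, 1) for v in order}
    state.update(evidence)  # conditioning variables are clamped to their values
    hits = 0
    for t in range(burn_in + n_iter):
        for v in order:
            if v in evidence:
                continue  # never resample a conditioning variable
            state[v] = 1 if rng.random() < local[v](state) else 0
        if t >= burn_in:
            hits += state[target]
    return hits / n_iter

p_a = gibbs_estimate(LOCAL, ["A", "B"], "A")                    # true value 0.5
p_a_given_b1 = gibbs_estimate(LOCAL, ["A", "B"], "A", evidence={"B": 1})  # true value 0.75
```

In this toy case the conditional query could be read directly from the local distribution of A, which mirrors the direct-lookup branch of Algorithm 1; sampling is only needed when some parents of the target are neither processed nor conditioned on.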
Another class of useful applications for dependency networks is related to data visualization, that is

    Read more →
  • ARMA International

    ARMA International

    ARMA International (formerly the Association of Records Managers and Administrators) is an American not-for-profit professional association for information professionals, primarily those working in information management (including records management) and information governance, as well as related industry practitioners and vendors. The association provides educational opportunities and publications covering aspects of information management broadly. == History == The Association was founded in 1955. In 1975, the Association of Records Executives and Administrators (AREA) and the American Records Management Association merged to form ARMA International. The headquarters of ARMA International is located in Overland Park, Kansas. == Operations == ARMA International serves professionals in the United States, Canada, Japan, and the United Kingdom. Its members include records managers, attorneys, information technology professionals, consultants, and archivists involved in various aspects of managing records and information assets. ARMA hosts an annual conference with the goal of bringing together records and information management professionals from around the world. In 2023, ARMA hosted conferences in both the United States and Canada. Topics addressed in the 120+ educational sessions include advanced technology, creating information structure, e-discovery and information law, information management fundamentals, information project management, and reducing organizational information risk. The expo features exhibitors displaying records and information technologies, products, and services.

    Read more →