AI Generator Detector

AI Generator Detector — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Gemini Enterprise Agent Platform

    Gemini Enterprise Agent Platform

    Gemini Enterprise Agent Platform (formerly known as Vertex AI) is a managed machine learning (ML) and artificial intelligence (AI) platform developed by Google Cloud. It provides a unified environment for building, training, deploying, and scaling ML models and generative AI applications. The platform integrates tools for the full ML lifecycle, including data preparation, model training, evaluation, deployment, and monitoring, under a single API and user interface. Vertex AI was announced at Google I/O and released as a generally available product on May 18, 2021. At launch, Google described Vertex AI as unifying its AutoML offerings with its prior Cloud AI Platform capabilities, and as adding operational features intended to help teams move models from experimentation into production use. On April 22, 2026, Google announced Gemini Enterprise Agent Platform as the replacement evolution of Vertex AI. == History == Google Cloud announced the general availability of Vertex AI on May 18, 2021, at the Google I/O developer conference. The platform was designed to consolidate Google Cloud's previously separate ML offerings, including AutoML and the legacy AI Platform, into a single system. At launch, Google claimed that Vertex AI required roughly 80% fewer lines of code to train a model compared to competing platforms. In June 2023, Google made generative AI support in Vertex AI generally available, giving developers access to foundation models including PaLM 2, Imagen, and Codey through the platform's Model Garden and the newly launched Generative AI Studio. At the time of this launch, Model Garden included over 60 models from Google and its partners. In August 2023, at the Google Cloud Next conference, Google announced further updates to Vertex AI, including the addition of third-party models such as Claude 2 from Anthropic and Llama 2 from Meta to the Model Garden, as well as new tools called Vertex AI Extensions for connecting models to APIs for real-time data retrieval. At the same event, Vertex AI Search and Conversation were made generally available, providing enterprise search and chatbot capabilities powered by foundation models. In April 2024, at Google Cloud Next, the company introduced Vertex AI Agent Builder, a no-code tool for creating AI-powered conversational agents built on top of Gemini large language models. This brought together the existing Vertex AI Search and Conversation products with new developer tools for building generative AI experiences. == Features == === Model training === Vertex AI supports both AutoML, which enables code-free model training on tabular, image, text, or video data, and custom training, which gives users full control over the ML framework, training code, and hyperparameter tuning. The platform provides serverless training as well as dedicated training clusters with GPU and TPU accelerators. Vertex AI Vizier handles automatic hyperparameter tuning, and Vertex AI Experiments allows comparison and tracking of training runs. === Model Garden === The Vertex AI Model Garden is a curated catalog of over 200 enterprise-ready models, including Google's own foundation models (such as Gemini, Imagen, and Veo), third-party models (such as Anthropic's Claude and Mistral AI models), and popular open-source models (such as Llama and Gemma). Models are accessible as fully managed model-as-a-service APIs. === Pipelines (workflow orchestration) === Vertex AI Pipelines provides managed orchestration of ML workflows and supports pipelines built with the Kubeflow Pipelines SDK, among other options described in Google Cloud documentation. === Vertex AI Studio === Vertex AI Studio provides tools for prompt design, testing, and model management, allowing developers to prototype and build generative AI applications using natural language, code, images, or video. === Agent Builder and Agent Engine === Vertex AI Agent Builder is a suite of products for building, deploying, and governing AI agents in production environments. It supports development with the open-source Agent Development Kit (ADK) and other frameworks. Vertex AI Agent Engine provides the underlying infrastructure for deploying and scaling agents, with support for enterprise security features including HIPAA compliance, customer-managed encryption keys (CMEK), and VPC Service Controls. === Generative AI tooling and model access === Google markets Vertex AI as providing access to Google foundation models (including the Gemini family) and developer tools such as Vertex AI Studio, along with a model catalog that includes Google and selected open source models (marketed as "Model Garden"). Google has also offered products within Vertex AI aimed at building generative search and conversational applications, including offerings named "Vertex AI Search" and "Vertex AI Conversation" as reported in 2023 coverage of platform updates. === MLOps tools === The platform includes a range of MLOps capabilities: Vertex AI Pipelines for orchestrating and automating ML workflows as reusable pipelines. Vertex AI Feature Store for serving, sharing, and reusing ML features across projects. Vertex AI Model Registry for storing, versioning, and managing trained models. Vertex AI Model Monitoring for detecting training-serving skew and inference drift in deployed models. Vertex Explainable AI for interpreting model predictions. Vertex AI Workbench for managed JupyterLab notebook environments integrated with Google Cloud Storage and BigQuery. == Industry recognition == Google was named a Leader for the fifth consecutive year in the 2024 Gartner Magic Quadrant for Cloud AI Developer Services, a recognition that encompasses Vertex AI and its related offerings. Google was also recognized as a Leader in the 2024 Gartner Magic Quadrant for Data Science and Machine Learning Platforms and was named a Leader in the Forrester Wave for AI/ML Platforms, Q3 2024. In October 2025, Google was also named a Leader in the 2025 IDC (International Data Corporation) MarketScape for Worldwide GenAI Life-Cycle Foundation Model Software. == Pricing == Vertex AI uses a pay-as-you-go pricing model, with costs determined by the specific services consumed, including model training, prediction serving, and data storage. For generative AI tasks, pricing is based on a per-token model, with rates varying depending on the specific model used and whether tokens are input or output. Google offers a free tier for new users, which includes limited custom training hours and online prediction usage, along with an introductory US$300 in Google Cloud credits valid for 90 days. == Adoption == In the year following its 2021 launch, Google reported that usage of Vertex AI and BigQuery had driven 2.5 times more machine learning predictions compared to the prior year, and that active customers of Vertex AI Workbench had grown 25-fold over a six-month period. Early enterprise adopters included Ford, Wayfair, and Seagate, among others. Wayfair reported that it was able to run large model training jobs 5 to 10 times faster using the platform.

    Read more →
  • Reference data

    Reference data

    Reference data is data used to classify or categorize other data. Typically, they are static or slowly changing over time. Examples of reference data include: Units of measurement Country codes Corporate codes Fixed conversion rates e.g., weight, temperature, and length Calendar structure and constraints Reference data sets are sometimes alternatively referred to as a "controlled vocabulary" or "lookup" data. Reference data differs from master data. While both provide context for business transactions, reference data is concerned with classification and categorisation, while master data is concerned with business entities. A further difference between reference data and master data is that a change to the reference data values may require an associated change in business process to support the change, while a change in master data will always be managed as part of existing business processes. For example, adding a new customer or sales product is part of the standard business process. However, adding a new product classification (e.g. "restricted sales item") or a new customer type (e.g. "gold level customer") will result in a modification to the business processes to manage those items. == Externally-defined reference data == For most organisations, most or all reference data is defined and managed within that organisation. Some reference data, however, may be externally defined and managed, for example by standards organizations. An example of externally defined reference data is the set of country codes as defined in ISO 3166-1. == Reference data management == Curating and managing reference data is key to ensuring its quality and thus fitness for purpose. All aspects of an organisation, operational and analytical, are greatly dependent on the quality of an organization's reference data. Without consistency across business process or applications, for example, similar things may be described in quite different ways. Reference data gain in value when they are widely re-used and widely referenced. Examples of good practice in reference data management include: Formalize the reference data management Use external reference data as much as possible Govern the reference data specific to your enterprise Manage reference data at enterprise level Version control your reference data

    Read more →
  • MarkLogic Server

    MarkLogic Server

    MarkLogic Server is a document-oriented database developed by MarkLogic. It is a NoSQL multi-model database that evolved from an XML database to natively store JSON documents and RDF triples, the data model for semantics. MarkLogic is designed to be a data hub for operational and analytical data. == History == MarkLogic Server was built to address shortcomings with existing search and data products. The product first focused on using XML as the document markup standard and XQuery as the query standard for accessing collections of documents up to hundreds of terabytes in size. Currently the MarkLogic platform is widely used in publishing, government, finance and other sectors. MarkLogic's customers are mostly Global 2000 companies. == Technology == MarkLogic uses documents without upfront schemas to maintain a flexible data model. In addition to having a flexible data model, MarkLogic uses a distributed, scale-out architecture that can handle hundreds of billions of documents and hundreds of terabytes of data. It has received Common Criteria certification, and has high availability and disaster recovery. MarkLogic is designed to run on-premises and within public or private cloud environments like Amazon Web Services. == Features == Indexing MarkLogic indexes the content and structure of documents including words, phrases, relationships, and values in over 200 languages with tokenization, collation, and stemming for core languages. Functionality includes the ability to toggle range indexes, geospatial indexes, the RDF triple index, and reverse indexes on or off based on your data, the kinds of queries that you will run, and your desired performance. Full-text search MarkLogic supports search across its data and metadata using a word or phrase and incorporates Boolean logic, stemming, wildcards, case sensitivity, punctuation sensitivity, diacritic sensitivity, and search term weighting. Data can be searched using JavaScript, XQuery, SPARQL, and SQL. Semantics MarkLogic uses RDF triples to provide semantics for ease of storing metadata and querying. ACID Unlike other NoSQL databases, MarkLogic maintains ACID consistency for transactions. Replication MarkLogic provides high availability with replica sets. Scalability MarkLogic scales horizontally using sharding. MarkLogic can run over multiple servers, balancing the load or replicating data to keep the system up and running in the event of hardware failure. Security MarkLogic has built in security features such as element-level permissions and data redaction. Optic API for Relational Operations An API that lets developers view their data as documents, graphs or rows. Security MarkLogic provides redaction, encryption, and element-level security (allowing for control on read and write rights on parts of a document). == Applications == Banking Big Data Fraud prevention Insurance Claims Management and Underwriting Master data management Recommendation engines == Licensing == MarkLogic is available under various licensing and delivery models, namely a free Developer or an Essential Enterprise license.[3] Licenses are available from MarkLogic or directly from cloud marketplaces such as Amazon Web Services and Microsoft Azure. == Releases == 2001 – Cerisent XQE 1: ACID transactions, Full-text search, XML Storage, XQuery, Role-based security 2004 – Cerisent XQE 2: Scale-out architecture, Enhanced search (stemming, thesaurus, wildcard), Backup and restore 2005 – MarkLogic Server 3: Continuing search improvements, Content Processing Framework (including PDF, Word, Excel, PPT), Failover 2008 – MarkLogic Server 4: Geospatial search, entity extraction, advanced XQuery, performance, scalability enhancements, auditing 2011 – MarkLogic Server 5: Flexible replication / DDIL, real-time indexing, advanced search, improved analytics, concurrency enhancements 2012 – MarkLogic Server 6: REST and Java APIs, App Builder, enhanced UI, improved search 2013 – MarkLogic Server 7: Semantic graph, bitemporal data, tiered storage, improved search, better management 2015 – MarkLogic Server 8: A Native JSON storage, Server-side JavaScript, Bitemporal, Node.js client API, Incremental backup, Flexible replication[16] 2017 – MarkLogic Server 9: Data integration across Relational and Non-Relational data, Advanced Encryption, Element Level Security, Redaction 2019 – MarkLogic Server 10: Enhanced Data Hub, improved SQL, security, analytics performance, cloud support 2022 – MarkLogic Server 11: MarkLogic Ops Director (Monitoring and Administration Improvements), expanded PKI 2025 – MarkLogic Server 12: Generative AI and Native Vector Search, Graph Algorithm Support, Virtual TDEs (relational views on the fly)

    Read more →
  • Project workforce management

    Project workforce management

    Project workforce management is the practice of combining the coordination of all logistic elements of a project through a single software application (or workflow engine). This includes planning and tracking of schedules and mileposts, cost and revenue, resource allocation, as well as overall management of these project elements. Efficiency is improved by eliminating manual processes, like spreadsheet tracking to monitor project progress. It also allows for at-a-glance status updates and ideally integrates with existing legacy applications in order to unify ongoing projects, enterprise resource planning (ERP) and broader organizational goals. There are a lot of logistic elements in a project. Different team members are responsible for managing each element and often, the organisation may have a mechanism to manage some logistic areas as well. By coordinating these various components of project management, workforce management and financials through a single solution, the process of configuring and changing project and workforce details is simplified. == Introduction == A project workforce management system defines project tasks, project positions, and assigns personnel to the project positions. The project tasks and positions are correlated to assign a responsible project position or even multiple positions to complete each project task. Because each project position may be assigned to a specific person, the qualifications and availabilities of that person can be taken into account when determining the assignment. By associating project tasks and project positions, a manager can better control the assignment of the workforce and complete the project more efficiently. When it comes to project workforce management, it is all about managing all the logistic aspects of a project or an organisation through a software application. Usually, this software has a workflow engine defined. Therefore, all the logistic processes take place in the workflow engine. == About == === Technical field === This invention relates to project management systems and methods, more particularly to a software-based system and method for project and workforce management. === Software usage === Due to the software usage, all the project workflow management tasks can be fully automated without leaving many tasks for the project managers. This returns high efficiency to the project management when it comes to project tracking proposes. In addition to different tracking mechanisms, project workforce management software also offer a dashboard for the project team. Through the dashboard, the project team has a glance view of the overall progress of the project elements. Most of the times, project workforce management software can work with the existing legacy software systems such as ERP (enterprise resource planning) systems. This easy integration allows the organisation to use a combination of software systems for management purposes. === Background === Good project management is an important factor for the success of a project. A project may be thought of as a collection of activities and tasks designed to achieve a specific goal of the organisation, with specific performance or quality requirements while meeting any subject time and cost constraints. Project management refers to managing the activities that lead to the successful completion of a project. Furthermore, it focuses on finite deadlines and objectives. A number of tools may be used to assist with this as well as with assessment. Project management may be used when planning personnel resources and capabilities. The project may be linked to the objects in a professional services life cycle and may accompany the objects from the opportunity over quotation, contract, time and expense recording, billing, period-end-activities to the final reporting. Naturally the project gets even more detailed when moving through this cycle. For any given project, several project tasks should be defined. Project tasks describe the activities and phases that have to be performed in the project such as writing of layouts, customising, testing. What is needed is a system that allows project positions to be correlated with project tasks. Project positions describe project roles like project manager, consultant, tester, etc. Project-positions are typically arranged linearly within the project. By correlating project tasks with project positions, the qualifications and availability of personnel assigned to the project positions may be considered. == Benefits of project management == Good project management should: Reduce the chance of a project failing Ensure a minimum level of quality and that results meet requirements and expectations Free up other staff members to get on with their area of work and increase efficiency both on the project and within the business Make things simpler and easier for staff with a single point of contact running the overall project Encourage consistent communications amongst staff and suppliers Keep costs, timeframes and resources to budget == Workflow engine == When it comes to project workforce management, it is all about managing all the logistic aspects of a project or an organisation through a software application. Usually, this software has a workflow engine defined in them. So, all the logistic processes take place in the workflow engine. The regular and most common types of tasks handled by project workforce management software or a similar workflow engine are: === Planning and monitoring project schedules and milestones === Regularly monitoring your project's schedule performance can provide early indications of possible activity-coordination problems, resource conflicts, and possible cost overruns. To monitor schedule performance. Collecting information and evaluating it ensure a project accuracy. The project schedule outlines the intended result of the project and what's required to bring it to completion. In the schedule, we need to include all the resources involved and cost and time constraints through a work breakdown structure (WBS). The WBS outlines all the tasks and breaks them down into specific deliverables. === Tracking the cost and revenue aspects of projects === The importance of tracking actual costs and resource usage in projects depends upon the project situation. Tracking actual costs and resource usage is an essential aspect of the project control function. === Resource utilisation and monitoring === Organisational profitability is directly connected to project management efficiency and optimal resource utilisation. To sum up, organisations that struggle with either or both of these core competencies typically experience cost overruns, schedule delays and unhappy customers. The focus for project management is the analysis of project performance to determine whether a change is needed in the plan for the remaining project activities to achieve the project goals. == Other management aspects of project management == === Project risk management === Risk identification consists of determining which risks are likely to affect the project and documenting the characteristics of each. === Project communication management === Project communication management is about how communication is carried out during the course of the project === Project quality management === It is of no use completing a project within the set time and budget if the final product is of poor quality. The project manager has to ensure that the final product meets the quality expectations of the stakeholders. This is done by good: Quality planning: Identifying what quality standards are relevant to the project and determining how to meet them. Quality assurance: Evaluating overall project performance on a regular basis to provide confidence that the project will satisfy the relevant quality standards. Quality control: Monitoring specific project results to determine if they comply with relevant quality standards and identifying ways to remove causes of poor performance. == Project workforce management vs. traditional management == There are three main differences between Project Workforce Management and traditional project management and workforce management disciplines and solutions: === Workflow-driven === All project and workforce processes are designed, controlled and audited using a built-in graphical workflow engine. Users can design, control and audit the different processes involved in the project. The graphical workflow is quite attractive for the users of the system and allows the users to have a clear idea of the workflow engine. === Organisation and work breakdown structures === Project Workforce Management provides organization and work breakdown structures to create, manage and report on functional and approval hierarchies, and to track information at any level of detail. Users can create, manage, edit and report work breakdown structures. Work breakdown structures have different abstraction

    Read more →
  • Enterprise cognitive system

    Enterprise cognitive system

    Enterprise cognitive systems (ECS) are part of a broader shift in computing, from a programmatic to a probabilistic approach, called cognitive computing. An Enterprise Cognitive System makes a new class of complex decision support problems computable, where the business context is ambiguous, multi-faceted, and fast-evolving, and what to do in such a situation is usually assessed today by the business user. An ECS is designed to synthesize a business context and link it to the desired outcome. It recommends evidence-based actions to help the end-user achieve the desired outcome. It does so by finding past situations similar to the current situation, and extracting the repeated actions that best influence the desired outcome. While general-purpose cognitive systems can be used for different outputs, prescriptive, suggestive, instructive, or simply entertaining, an enterprise cognitive system is focused on action, not insight, to help in assessing what to do in a complex situation. == Key characteristics == ECS have to be: Adaptive: They must learn as information changes, and as goals and requirements evolve. They must resolve ambiguity and tolerate unpredictability. They must be engineered to feed on dynamic data in real time, or near real time. In the Enterprise, near-real time learning from data requires an agile information federation approach to ingest incremental data updates as they occur, and an unsupervised learning approach to ensure that new best practice is leveraged across the organization in a timely manner. Interactive: They must interact easily with users so that those users can define their needs comfortably. They may also interact with other processors, devices, and Cloud services, as well as with people. In the Enterprise, interactions are controlled via existing workflows and UIs. Therefore, embedding best practices directly into these existing interfaces, in the context of a specific step, is critical to ensure maximum end-user adoption. Iterative and stateful: They must aid in defining a problem by asking questions or finding additional source input if a problem statement is ambiguous or incomplete. They must “remember” previous interactions in a process and return information that is suitable for the specific application at that point in time. In the Enterprise, business context is often structured by a business process, and therefore sufficiently data-rich to make relevant recommendations without significant iterations from the end-user. A stateful memory of overall interactions across communication channels is critical for understanding of context, as a static profile will not capture intent and outcome potential the way behavior does. Contextual: They must understand, identify, and extract contextual elements such as meaning, syntax, time, location, appropriate domain, regulations, user's profile, process, task and goal. They may draw on multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided). In the Enterprise, Context is fragmented and must be aggregated across data types, sources, and locations. In most business environments, such data is captured in existing enterprise information systems, and the effort is linked to quickly source and unify such information. It is rare to have to directly process sensor, audio or visual data in real-time as direct input into the enterprise cognitive system. Instead, these data types are captured by Enterprise Applications and pre-processed into a binary or text format prior to consumption by the System. == Business applications powered by an ECS == Bottlenose – trends and brands monitoring Cybereason – security threat monitoring Dataminr – social media monitoring

    Read more →
  • SIGMOD Edgar F. Codd Innovations Award

    SIGMOD Edgar F. Codd Innovations Award

    The ACM SIGMOD Edgar F. Codd Innovations Award is a lifetime research achievement award given by the ACM Special Interest Group on Management of Data, at its yearly flagship conference (also called SIGMOD). According to its homepage, it is given "for innovative and highly significant contributions of enduring value to the development, understanding, or use of database systems and databases". The award has been given since 1992. Until 2003, this award was known as the “SIGMOD Innovations Award.” In 2004, SIGMOD, with the unanimous approval of ACM Council, decided to rename the award to honor Dr. E.F. (Ted) Codd (1923 – 2003) who invented the relational data model and was responsible for the significant development of the database field as a scientific discipline. == Recipients ==

    Read more →
  • EdgeRank

    EdgeRank

    EdgeRank is the name commonly given to the algorithm that Facebook uses to determine what articles should be displayed in a user's News Feed. As of 2011, Facebook has stopped using the EdgeRank system and uses a machine learning algorithm that, as of 2013, takes more than 100,000 factors into account. EdgeRank was developed and implemented by Serkan Piantino. == Formula and factors == In 2010, a simplified version of the EdgeRank algorithm was presented as: ∑ e d g e s e u e w e d e {\displaystyle \sum _{\mathrm {edges\,} e}u_{e}w_{e}d_{e}} where: u e {\displaystyle u_{e}} is user affinity. w e {\displaystyle w_{e}} is how the content is weighted. d e {\displaystyle d_{e}} is a time-based decay parameter. User Affinity: The User Affinity part of the algorithm in Facebook's EdgeRank looks at the relationship and proximity of the user and the content (post/status update). Content Weight: What action was taken by the user on the content. Time-Based Decay Parameter: New or old. Newer posts tend to hold a higher place than older posts. Some of the methods that Facebook uses to adjust the parameters are proprietary and not available to the public. A study has shown that it is possible to hypothesize a disadvantage of the "like" reaction and advantages of other interactions (e.g., the "haha" reaction or "comments") in content algorithmic ranking on Facebook. The "like" button can decrease the organic reach as a "brake effect of viral reach". The "haha" reaction, "comments" and the "love" reaction could achieve the highest increase in total organic reach. == Impact == EdgeRank and its successors have a broad impact on what users actually see out of what they ostensibly follow: for instance, the selection can produce a filter bubble (if users are exposed to updates which confirm their opinions etc.) or alter people's mood (if users are shown a disproportionate amount of positive or negative updates). As a result, for Facebook pages, the typical engagement rate is less than 1% (or less than 0.1% for the bigger ones), and organic reach 10% or less for most non-profits. As a consequence, for pages, it may be nearly impossible to reach any significant audience without paying to promote their content.

    Read more →
  • KiSAO

    KiSAO

    The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of systems biology models, their characterization and interrelationships. KiSAO is part of the BioModels.net project and of the COMBINE initiative. == Structure == KiSAO consists of three main branches: simulation algorithm simulation algorithm characteristic simulation algorithm parameter The elements of each algorithm branch are linked to characteristic and parameter branches using has characteristic and has parameter relationships accordingly. The algorithm branch itself is hierarchically structured using relationships which denote that the descendant algorithms were derived from, or specify, more general ancestors.

    Read more →
  • Zero-day vulnerability

    Zero-day vulnerability

    A zero-day (also known as a 0-day) is a vulnerability or security hole in a computer system unknown to its developers or anyone capable of mitigating it. Until the vulnerability is remedied, threat actors can exploit it in a zero-day exploit, or zero-day attack. The term "zero-day" originally referred to the number of days since a new piece of software was released to the public, so "zero-day software" was obtained by hacking into a developer's computer before release. Eventually the term was applied to the vulnerabilities that allowed this hacking, and to the number of days that the vendor has had to fix them. Vendors who discover the vulnerability may create patches or advise workarounds to mitigate it, though users need to deploy that mitigation to eliminate the vulnerability in their systems. Zero-day attacks are severe threats. == Definition == Despite developers' goal of delivering a product that works entirely as intended, virtually all products contain software and hardware bugs. If a bug creates a security risk, it is called a vulnerability. Vulnerabilities vary in their ability to be exploited by malicious actors. Some are not usable at all, while others can be used to disrupt the device with a denial of service attack. The most dangerous allow the attacker to inject and run their own code, without the user being aware of it. Although the term "zero-day" initially referred to the time since the vendor had become aware of the vulnerability, zero-day vulnerabilities can also be defined as the subset of vulnerabilities for which no patch or other fix is available. A zero-day exploit is any exploit that takes advantage of such a vulnerability. == Exploits == An exploit is the delivery mechanism that takes advantage of the vulnerability to penetrate the target's systems, for such purposes as disrupting operations, installing malware, or exfiltrating data. Researchers Lillian Ablon and Andy Bogart write that "little is known about the true extent, use, benefit, and harm of zero-day exploits". Exploits based on zero-day vulnerabilities are considered more dangerous than those that take advantage of a known vulnerability. However, it is likely that most cyberattacks use known vulnerabilities, not zero-days. Governments of states are the primary users of zero-day exploits, not only because of the high cost of finding or buying vulnerabilities, but also the significant cost of writing the attack software. Nevertheless, anyone can use a vulnerability, and according to research by the RAND Corporation, "any serious attacker can always get an affordable zero-day for almost any target". Many targeted attacks and most advanced persistent threats rely on zero-day vulnerabilities. In 2017, the average time to develop an exploit from a zero-day vulnerability was estimated at 22 days. The difficulty of developing exploits has been increasing over time due to increased anti-exploitation features in popular software. === Window of vulnerability === Zero-day vulnerabilities are often classified as alive—meaning that there is no public knowledge of the vulnerability—and dead—the vulnerability has been disclosed, but not patched. If the software's maintainers are actively searching for vulnerabilities, it is a living vulnerability; such vulnerabilities in unmaintained software are called immortal. Zombie vulnerabilities can be exploited in older versions of the software but have been patched in newer versions. Even publicly known and zombie vulnerabilities are often exploitable for an extended period. Security patches can take months to develop, or may never be developed. A patch can have negative effects on the functionality of software and users may need to test the patch to confirm functionality and compatibility. Larger organizations may fail to identify and patch all dependencies, while smaller enterprises and personal users may not install patches. Research suggests that risk of cyberattack increases if the vulnerability is made publicly known or a patch is released. Cybercriminals can reverse engineer the patch to find the underlying vulnerability and develop exploits, often faster than users install the patch. According to research by RAND Corporation published in 2017, zero-day exploits remain usable for 6.9 years on average, although those purchased from a third party only remain usable for 1.4 years on average. The researchers were unable to determine if any particular platform or software (such as open-source software) had any relationship to the life expectancy of a zero-day vulnerability. Although the RAND researchers found that 5.7 percent of a stockpile of secret zero-day vulnerabilities will have been discovered by someone else within a year, another study found a higher overlap rate, as high as 10.8 percent to 21.9 percent per year. == Countermeasures == Because, by definition, there is no patch that can block a zero-day exploit, all systems employing the software or hardware with the vulnerability are at risk. This includes secure systems such as banks and governments that have all patches up to date. Security systems are designed around known vulnerabilities, and repeated exploitations of a zero-day exploit could continue undetected for an extended period of time. Although there have been many proposals for a system that is effective at detecting zero-day exploits, this remains an active area of research in 2023. Many organizations have adopted defense-in-depth tactics so that attacks are likely to require breaching multiple levels of security, which makes it more difficult to achieve. Conventional cybersecurity measures such as training and access control — including multi-factor authentication, least-privilege access, and air-gapping makes it harder to compromise systems with a zero-day exploit. Since writing perfectly secure software is impossible, some researchers argue that driving up the cost of exploits is considered a good strategy to reduce the burden of cyberattacks. == Market == Zero-day exploits can fetch millions of dollars. There are three main types of buyers: White: the vendor, or to third parties such as the Zero Day Initiative that disclose to the vendor. Often such disclosure is in exchange for a bug bounty. Not all companies respond positively to disclosures, as they can cause legal liability and operational overhead. It is not uncommon to receive cease-and-desist letters from software vendors after disclosing a vulnerability for free. Gray: the largest and most lucrative. Government or intelligence agencies buy zero-days and may use it in an attack, stockpile the vulnerability, or notify the vendor. The United States federal government is one of the largest buyers. As of 2013, the Five Eyes (United States, United Kingdom, Canada, Australia, and New Zealand) captured the plurality of the market and other significant purchasers included Russia, India, Brazil, Malaysia, Singapore, North Korea, and Iran. Middle Eastern countries were poised to become the biggest spenders. Black: organized crime, which typically prefers exploit software rather than just knowledge of a vulnerability. These users are more likely to employ "half-days" where a patch is already available. In 2015, the markets for government and crime were estimated at least ten times larger than the white market. Sellers are often hacker groups that seek out vulnerabilities in widely used software for financial reward. Some will only sell to certain buyers, while others will sell to anyone. White market sellers are more likely to be motivated by non pecuniary rewards such as recognition and intellectual challenge. Selling zero-day exploits is legal. Despite calls for more regulation, law professor Mailyn Fidler says there is little chance of an international agreement because key players such as Russia and Israel are not interested. The sellers and buyers that trade in zero-days tend to be secretive, relying on non-disclosure agreements and classified information laws to keep the exploits secret. If the vulnerability becomes known, it can be patched and its value consequently crashes. Because the market lacks transparency, it can be hard for parties to find a fair price. Sellers might not be paid if the vulnerability was disclosed before it was verified, or if the buyer declined to purchase it but used it anyway. With the proliferation of middlemen, sellers could never know to what use the exploits could be put. Buyers could not guarantee that the exploit was not sold to another party. Both buyers and sellers advertise on the dark web. Research published in 2022 based on maximum prices paid as quoted by a single exploit broker found a 44 percent annualized inflation rate in exploit pricing. Remote zero-click exploits could fetch the highest price, while those that require local access to the device are much cheaper. Vulnerabilities in widely used software are also more expensive. They estimated that around 400 to 1,500 people sold exploits to th

    Read more →
  • Pol.is

    Pol.is

    Polis (or Pol.is) is wiki survey software designed for large group collaborations. As a civic technology, Polis allows people to share their opinions and ideas, and its algorithm is intended to elevate ideas that can facilitate better decision-making, especially when there are lots of participants. Polis has been credited for assisting the passage of legislation in Taiwan. Pol.is has been used by governments in the United States, Canada, Singapore, Philippines, Finland, Spain and elsewhere. == History == Pol.is was founded by Colin Megill, Christopher Small, and Michael Bjorkegren after the Occupy Wall Street and Arab Spring movements. In Taiwan, pol.is has been "one of the key parts" of vTaiwan's suite of open-source tools for its citizen engagement efforts arising out of the Sunflower Student Movement. vTaiwan claims that of the 26 national issues related to technology discussed on the platform, 80% led to government action. Pol.is is also utilized by "Join," a national platform for online deliberation run by the Taiwanese government. In 2022, Wired reported that Polis was an influence on the Community Notes project at Twitter. In 2023, Megill advised OpenAI on how to facilitate deliberation at scale in a way that was more efficient than Polis, which still required significant human labor and analysis at the time. He helped to award $1 million in grants to teams working on solving the problem of deliberation at scale. In 2023, Anthropic was also exploring steering model behavior using Polis. In 2025, it helped the county that includes Bowling Green, Kentucky make a 25 year plan by facilitating the collection and review of ideas from thousands of residents, representing 10% of the county. 2,370 of 3,940 unique ideas were agreed-upon by over 80% of survey respondents. Ideas were screened by volunteers if they were redundant to an existing idea, off-topic or obscene. == How it works == Pol.is participants are anonymous and cannot reply directly to others posts, in an effort to avoid personal attacks for users. Its algorithms are designed not for engagement and scrolling, but to find areas of agreement to better understand the nuances of a wide range of opinions. Participants are prompted for ideas and vote on other participants' ideas. == Reception == Andrew Leonard, The Financial Times, and VentureBeat describe Pol.is as a possible antidote to the divisiveness of traditional internet discourse by gamifying consensus. Audrey Tang agreed saying, "Polis is quite well known in that it's a kind of social media that instead of polarizing people to drive so called engagement or addiction or attention, it automatically drives bridge making narratives and statements. So only the ideas that speak to both sides or to multiple sides will gain prominence in Polis." Niall Ferguson argues that the approach to utilize tools like Pol.is and Join in Taiwan empowers ordinary people instead of the elite and protects individual freedoms, providing a contrast to the AI-enhanced panopticon model seen in China. Carl Miller praised the technology as having "gamified finding consensus." Darshana Narayanan, in an op-ed in the Economist, argues that open-source machine-learning-based tools like Polis can help to bypass the influence of special interests or experts. Jamie Susskind cited polis and vTaiwan as a model for democracies, particularly around digital policy issues.

    Read more →
  • Bibliographic database

    Bibliographic database

    A bibliographic database is a database of bibliographic records. This is an organised online collection of references to published written works like journal and newspaper articles, conference proceedings, reports, government and legal publications, patents and books. In contrast to library catalogue entries, a majority of the records in bibliographic databases describe articles and conference papers rather than complete monographs, and they generally contain very rich subject descriptions in the form of keywords, subject classification terms, or abstracts. A bibliographic database may cover a wide range of topics or one academic field like computer science. A significant number of bibliographic databases are marketed under a trade name by licensing agreement from vendors, or directly from their makers: the indexing and abstracting services. Many bibliographic databases have evolved into digital libraries, providing the full text of the organised contents:for instance CORE also organises and mirrors scholarly articles and OurResearch develops a search engine for open access content in Unpaywall. Others merge with non-bibliographic and scholarly databases to create more complete disciplinary search engine systems, such as Chemical Abstracts or Entrez. == History == Prior to the mid-20th century, individuals searching for published literature had to rely on printed bibliographic indexes, generated manually from index cards. During the early 1960s computers were used to digitize text for the first time; the purpose was to reduce the cost and time required to publish two American abstracting journals, the Index Medicus of the National Library of Medicine and the Scientific and Technical Aerospace Reports of the National Aeronautics and Space Administration (NASA). By the late 1960s, such bodies of digitized alphanumeric information, known as bibliographic and numeric databases, constituted a new type of information resource. Online interactive retrieval became commercially viable in the early 1970s over private telecommunications networks. The first services offered a few databases of indexes and abstracts of scholarly literature. These databases contained bibliographic descriptions of journal articles that were searchable by keywords in author and title, and sometimes by journal name or subject heading. The user interfaces were crude, the access was expensive, and searching was done by librarians on behalf of "end users".

    Read more →
  • Birkhoff algorithm

    Birkhoff algorithm

    Birkhoff's algorithm (also called Birkhoff-von-Neumann algorithm) is an algorithm for decomposing a bistochastic matrix into a convex combination of permutation matrices. It was published by Garrett Birkhoff in 1946. It has many applications. One such application is for the problem of fair random assignment: given a randomized allocation of items, Birkhoff's algorithm can decompose it into a lottery on deterministic allocations. == Terminology == A bistochastic matrix (also called: doubly-stochastic) is a matrix in which all elements are greater than or equal to 0 and the sum of the elements in each row and column equals 1. An example is the following 3-by-3 matrix: ( 0.2 0.3 0.5 0.6 0.2 0.2 0.2 0.5 0.3 ) {\displaystyle {\begin{pmatrix}0.2&0.3&0.5\\0.6&0.2&0.2\\0.2&0.5&0.3\end{pmatrix}}} A permutation matrix is a special case of a bistochastic matrix, in which each element is either 0 or 1 (so there is exactly one "1" in each row and each column). An example is the following 3-by-3 matrix: ( 0 1 0 0 0 1 1 0 0 ) {\displaystyle {\begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix}}} A Birkhoff decomposition (also called: Birkhoff-von-Neumann decomposition) of a bistochastic matrix is a presentation of it as a sum of permutation matrices with non-negative weights. For example, the above matrix can be presented as the following sum: 0.2 ( 0 1 0 0 0 1 1 0 0 ) + 0.2 ( 1 0 0 0 1 0 0 0 1 ) + 0.1 ( 0 1 0 1 0 0 0 0 1 ) + 0.5 ( 0 0 1 1 0 0 0 1 0 ) {\displaystyle 0.2{\begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix}}+0.2{\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}}+0.1{\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}}+0.5{\begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix}}} Birkhoff's algorithm receives as input a bistochastic matrix and returns as output a Birkhoff decomposition. == Tools == A permutation set of an n-by-n matrix X is a set of n entries of X containing exactly one entry from each row and from each column. A theorem by Dénes Kőnig says that: Every bistochastic matrix has a permutation-set in which all entries are positive.The positivity graph of an n-by-n matrix X is a bipartite graph with 2n vertices, in which the vertices on one side are n rows and the vertices on the other side are the n columns, and there is an edge between a row and a column if the entry at that row and column is positive. A permutation set with positive entries is equivalent to a perfect matching in the positivity graph. A perfect matching in a bipartite graph can be found in polynomial time, e.g. using any algorithm for maximum cardinality matching. Kőnig's theorem is equivalent to the following:The positivity graph of any bistochastic matrix admits a perfect matching.A matrix is called scaled-bistochastic if all elements are non-negative, and the sum of each row and column equals c, where c is some positive constant. In other words, it is c times a bistochastic matrix. Since the positivity graph is not affected by scaling:The positivity graph of any scaled-bistochastic matrix admits a perfect matching. == Algorithm == Birkhoff's algorithm is a greedy algorithm: it greedily finds perfect matchings and removes them from the fractional matching. It works as follows. Let i = 1. Construct the positivity graph GX of X. Find a perfect matching in GX, corresponding to a positive permutation set in X. Let z[i] > 0 be the smallest entry in the permutation set. Let P[i] be a permutation matrix with 1 in the positive permutation set. Let X := X − z[i] P[i]. If X contains nonzero elements, Let i = i + 1 and go back to step 2. Otherwise, return the sum: z[1] P[1] + ... + z[2] P[2] + ... + z[i] P[i]. The algorithm is correct because, after step 6, the sum in each row and each column drops by z[i]. Therefore, the matrix X remains scaled-bistochastic. Therefore, in step 3, a perfect matching always exists. == Run-time complexity == By the selection of z[i] in step 4, in each iteration at least one element of X becomes 0. Therefore, the algorithm must end after at most n2 steps. However, the last step must simultaneously make n elements 0, so the algorithm ends after at most n2 − n + 1 steps, which implies O ( n 2 ) {\displaystyle O(n^{2})} . In 1960, Joshnson, Dulmage and Mendelsohn showed that Birkhoff's algorithm actually ends after at most n2 − 2n + 2 steps, which is tight in general (that is, in some cases n2 − 2n + 2 permutation matrices may be required). == Application in fair division == In the fair random assignment problem, there are n objects and n people with different preferences over the objects. It is required to give an object to each person. To attain fairness, the allocation is randomized: for each (person, object) pair, a probability is calculated, such that the sum of probabilities for each person and for each object is 1. The probabilistic-serial procedure can compute the probabilities such that each agent, looking at the matrix of probabilities, prefers his row of probabilities over the rows of all other people (this property is called envy-freeness). This raises the question of how to implement this randomized allocation in practice? One cannot just randomize for each object separately, since this may result in allocations in which some people get many objects while other people get no objects. Here, Birkhoff's algorithm is useful. The matrix of probabilities, calculated by the probabilistic-serial algorithm, is bistochastic. Birkhoff's algorithm can decompose it into a convex combination of permutation matrices. Each permutation matrix represents a deterministic assignment, in which every agent receives exactly one object. The coefficient of each such matrix is interpreted as a probability; based on the calculated probabilities, it is possible to pick one assignment at random and implement it. == Extensions == The problem of computing the Birkhoff decomposition with the minimum number of terms has been shown to be NP-hard, but some heuristics for computing it are known. This theorem can be extended for the general stochastic matrix with deterministic transition matrices. Budish, Che, Kojima and Milgrom generalize Birkhoff's algorithm to non-square matrices, with some constraints on the feasible assignments. They also present a decomposition algorithm that minimizes the variance in the expected values. Vazirani generalizes Birkhoff's algorithm to non-bipartite graphs. Valls et al. showed that it is possible to obtain an ϵ {\displaystyle \epsilon } -approximate decomposition with O ( log ⁡ ( 1 / ϵ 2 ) ) {\displaystyle O(\log(1/\epsilon ^{2}))} permutations.

    Read more →
  • Query understanding

    Query understanding

    Query understanding is the process of inferring the intent of a search engine user by extracting semantic meaning from the searcher’s keywords. Query understanding methods generally take place before the search engine retrieves and ranks results. It is related to natural language processing but specifically focused on the understanding of search queries. == Methods == === Stemming and lemmatization === Many languages inflect words to reflect their role in the utterance they appear in. The variation between various forms of a word is likely to be of little importance for the relatively coarse-grained model of meaning involved in a retrieval system, and for this reason the task of conflating the various forms of a word is a potentially useful technique to increase recall of a retrieval system. Stemming algorithms, also known as stemmers, typically use a collection of simple rules to remove suffixes intended to model the language’s inflection rules. For some languages, there are simple lemmatisation methods to reduce a word in query to its lemma or root form or its stem; for others, this operation involves non-trivial string processing and may require recognizing the word's part of speech or referencing a lexical database. The effectiveness of stemming and lemmatization varies across languages. === Query Segmentation === Query segmentation is a key component of query understanding, aiming to divide a query into meaningful segments. Traditional approaches, such as the bag-of-words model, treat individual words as independent units, which can limit interpretative accuracy. For languages like Chinese, where words are not separated by spaces, segmentation is essential, as individual characters often lack standalone meaning. Even in English, the BOW model may not capture the full meaning, as certain phrases—such as "New York"—carry significance as a whole rather than as isolated terms. By identifying phrases or entities within queries, query segmentation enhances interpretation, enabling search engines to apply proximity and ordering constraints, ultimately improving search accuracy and user satisfaction. === Entity recognition === Entity recognition is the process of locating and classifying entities within a text string. Named-entity recognition specifically focuses on named entities, such as names of people, places, and organizations. In addition, entity recognition includes identifying concepts in queries that may be represented by multi-word phrases. Entity recognition systems typically use grammar-based linguistic techniques or statistical machine learning models. === Query rewriting === Query rewriting is the process of automatically reformulating a search query to more accurately capture its intent. Query expansion adds additional query terms, such as synonyms, in order to retrieve more documents and thereby increase recall. Query relaxation removes query terms to reduce the requirements for a document to match the query, thereby also increasing recall. Other forms of query rewriting, such as automatically converting consecutive query terms into phrases and restricting query terms to specific fields, aim to increase precision. === Spelling Correction === Automatic spelling correction is a critical feature of modern search engines, designed to address common spelling errors in user queries. Such errors are especially frequent as users often search for unfamiliar topics. By correcting misspelled queries, search engines enhance their understanding of user intent, thereby improving the relevance and quality of search results and overall user experience.

    Read more →
  • Maritime Informatics

    Maritime Informatics

    Maritime Informatics is a thematic topic within the broader discipline of informatics. It can be considered as both a field of study and domain of application. As an application domain, it is the outlet of innovations originating from data science and artificial intelligence; as a field of study, it is positioned between computer science and marine engineering. == Beginnings of maritime informatics == As a result of the increasing levels of digitalisation occurring in the maritime sector starting around 2010 and stimulated by the EU-endorsed MonaLisa project for sea traffic management (STM), a number of academics and shipping industry leaders recognised that the maritime transportation sector would benefit from a specific field of study and application to be known as Maritime Informatics - the use of information systems, data sharing and data analytics in the business and operations of maritime transportation. They considered that it would lead to improvements in efficiency, safety, resilience, and ecological sustainability - all of which are currently lacking for many aspects of sea transport. One of the first public airings of the concept of Maritime Informatics was a presentation delivered on 11 September 2014 in Gothenburg, Sweden. A proposal for an inaugural minitrack on Maritime Informatics was accepted for the 2015 Americas Conference on Information Systems in Puerto Rico where three papers were presented. Since then numerous publications has been brought forward captured at www.maritimeinformatics.org and in late 2020 the first reference book on Maritime Informatics was co-written by 81 expert contributors (47 practitioners and 34 researchers) from 20 countries. Most impactful authors and journals in the domain have been documented in a review paper. Dimitrios Zissis, Luca Cazzanti and Leonardo M. Millefiori are the top three authors; top journals and conferences include Ocean Engineering, Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems, Sensors, the international Conference On Engineering, Technology And Innovation, Expert Systems With Applications, IEEE Access, and Journal of Navigation. == Background == The shipping industry has several particular organisational aspects that are recognised and taken into account in maritime informatics: It is predominantly a self-organising ecosystem Many activities are undertaken as part of episodic tight coupling There is a so-called maritime stack There is increasing pressure to balance capital productivity and energy efficiency There is the potential virtuous interplay between different types of systems == Data sharing == Digital data sharing is key to the all-important, arguably fundamental, data analytics aspects of maritime informatics because it opens the way for better access to relevant and reliable data. As in land-based commerce, digital data sharing is a growing phenomenon in maritime operations - though there is a way to go. It is enabling greater transparency for all those involved in the transportation of goods and passengers, not least being the end-customer. This leads to better and more informed decision-making and planning by all those involved. The push for digitalisation and data sharing is being pursued both by governments and the commercial sector. For example, the Member States of the IMO agreed a mandatory requirement for their governments to introduce electronic information exchange between ships and ports as from 8 April 2019. Meanwhile, commercial operators, particularly in the container lines are putting systems in place for sharing data for mutual benefit in their operations. Data sharing is an important aspect of the Port Collaborative Decision Making (PortCDM) and Port Call Optimization initiatives, both of which seek to improve the coordination, synchronization and efficiency of the port call process by enabling a common and shared situational awareness among all those involved. == Standardisation == The availability and sharing of relevant digital data underpins maritime informatics and is key to more effective and efficient coordination and synchronisation in the predominantly self-organising ecosystem that is maritime transportation. For this to occur, a high priority underpinning maritime informatics is the encouragement of standardised digital data exchange and data sharing, leading, in turn, to improvements in shipping analytics. Improved availability of data will support better historical analysis, now-casting and forecasting. The International Maritime Organization (IMO) FAL Committee is taking the lead in ensuring that the common terms used in the various standards being developed or in use in the maritime sector are compatible and therefore interoperable as far as is practicable, by creating and maintaining The IMO Compendium on Facilitation and Electronic Business. The IMO Compendium consists of an IMO Data Set and IMO Reference Data Model agreed by the main organisations involved in the development of standards for the electronic exchange of information related to the FAL Convention: the World Customs Organization (WCO), the United Nations Economic Commission for Europe (UNECE) and the International Organization for Standardization (ISO). There are several other prominent international governmental and non-governmental organisations actively contributing to the ongoing standardisation and harmonisation process including the UN Electronic Data Interchange for Administration, Commerce and Transport (UN EDIFACT), the Digital Container Shipping Association (DCSA), the International Harbour Masters Association (IHMA) and BIMCO - the world's largest direct-membership organisation for shipowners, charterers, shipbrokers and agents.

    Read more →
  • Retention period

    Retention period

    A retention period (associated with a retention schedule or retention program) is an aspect of records and information management (RIM) and the records life cycle that identifies the duration of time for which the information should be maintained or "retained", irrespective of format (paper, electronic, or other). Retention periods vary with different types of information, based on content and a variety of other factors, including internal organizational need, regulatory requirements for inspection or audit, legal statutes of limitation, involvement in litigation, and taxation and financial reporting needs, as well as other factors as defined by local, regional, state, national, and/or international governing entities. Once an applicable retention period has elapsed for a given type or series of information, and all holds/moratoriums have been released, the information is typically destroyed using an approved and effective destruction method, which renders the information completely and irreversibly unusable via any means. Alternatively, it may be converted from one form to another (e.g. from paper to electronic), depending on the defined retention period per format. Information with historical value beyond its "usable value" may be accessioned to the custody of an archive organization for permanent or extended long-term preservation. == Defensible disposition == Defensible disposition refers to the ability of an identified and applied retention period to effectively provide for the defense of the record, and its eventual destruction or accessioning when scrutinized within a court of law or by other review. It is commonly advised by records and information management (RIM) professionals that any and all retention periods applied to organizational information should be reviewed and approved for use by competent legal counsel, which represents the organization, and is familiar with the specific business needs and legal and regulatory requirements of the organization. Additionally, a practical approach to information assessment/classification, proper documentation of the disposition program, strategic review of disposition policy over time for efficacy are required for proper defensible disposition. == Guidance and education organizations == ARMA International Information and Records Management Society filerskeepers records retention FAQ

    Read more →