AI Cv Review

AI Cv Review — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Spotify Live

    Spotify Live

    Spotify Live, formerly Spotify Greenroom, was a social audio app by Spotify, that allowed users to host or participate in live-audio virtual environments called "room" for conversations. Each room had a maximum capacity of 1000 people. The app was available on Android and iOS, competing with Twitter Spaces and Clubhouse in the social media segment. It was shut down on April 30, 2023. == History == In October 2020, Betty Labs released Locker Room exclusively on the iOS App Store. The app featured virtual audio chat rooms for sports enthusiasts. In late March 2021, Spotify acquired Betty Labs for $50 million and announced plans to rebrand the app with a broader focus on sports, music, and pop culture. On June 16, 2021, Spotify launched the app as Spotify Greenroom on Android (early access) and iOS, expanding its scope beyond just sports. At launch, Spotify introduced the Greenroom Creator Fund to support creators and shows, serving as a rival to Clubhouse's Creator First Accelerator Program. The fund aimed to provide a monetization path for podcasters integrating Greenroom into their verified Spotify accounts. By July 2021, the app had accumulated over 140,000 iOS installs and 100,000 Android installs. In August 2021, Spotify collaborated with the WWE to produce professional wrestling-related podcasts, many of which would be recorded by The Ringer, Spotify's in-house podcasting team, using Greenroom. In March 2022, Spotify Greenroom announced its rebranding as Spotify Live and its migration to the main Spotify app. After a year, Spotify announced it would shut down the Spotify Live app at the end of April 2023. == Features == Greenroom allowed users to create or join a room, which, in the context of the application, was a virtual space for real-time voice chats. Users could only create a room within a pre-defined group, representing either a brand or a generic category. If a user chose to create a room, they became the host, with the ability to invite people, control who could talk, and enable features like recording and the Discussions tab during room creation. Enabling recording displayed a disclaimer informing users that the conversation was being recorded, and the audio, recorded in mp4 format, would be sent to the host via email after the room concluded. If the Discussions tab was enabled, users could send text messages in the public chat section. The host also had the authority to ban users if necessary. When joining a room, a user could opt to be a listener or request to become a speaker. Users had the freedom to follow or block others and join groups at their discretion. Notifications about new rooms in joined groups would be sent to users. Additionally, users could discover new individuals and groups using the search tab. == Partnered creators == By October 2021, Spotify had a variety of partnered creators aimed at boosting traffic and validating its vertically integrated podcast model. These creators primarily focused on Generation Z. In-house Spotify talent, such as The Ringer, produced sports-related content. Simultaneously, the company recruited creators from various social channels to grow Greenroom's audience while also promoting its integration with Spotify and Anchor. Each verified Spotify partner had their Greenroom shows featured in both the Greenroom app and their profiles on the Spotify app. This was part of the company's strategy leading into the 2022 ramp-up to compete with Clubhouse. == Platforms == The app was accessible on both Android and iOS platforms, and users could download the app from their respective app stores. Android users needed Android 8 or above to launch the app, while iOS consumers required iOS 13 or later to run it.

    Read more →
  • E-Science librarianship

    E-Science librarianship

    E-Science librarianship refers to a role for librarians in e-Science. == Early scholars == Early references to e-Science and librarianship involve information studies scholars researching cyberinfrastructure and emerging networked information and knowledge communities. Notably Christine Borgman, Professor and Presidential Chair in Information Studies at the University of California, Los Angeles (UCLA) was a key player in bringing e-Science, and the idea of networked knowledge communities, to the attention of the library profession. In 2004, as a visiting fellow at the Oxford Internet Institute, she conducted research and lectured publicly on e-Science, Digital Libraries, and Knowledge Communities. In 2007 Anna K. Gold, formerly of MIT and Cal Poly, San Luis Obispo, authored a series of articles in D-Lib Magazine that opened the door for academic libraries to begin exploring roles, skills, and strategies for engaging in e-Science: Cyberinfrastructure, Data, and Libraries, Part 1: A Cyberinfrastructure Primer for Librarians and Cyberinfrastructure, Data, and Libraries, Part 2: Libraries and the Data Challenge: Roles and Actions for Libraries. == Academic research and health sciences libraries == In 2007, the Association of Research Libraries (ARL) e-Science task force issued its report on e-Science and librarianship. The ARL's report encouraged its member libraries to position themselves to engage with researchers involved in e-Science (eScience) by cultivating new research support strategies and developing their digital scholarship infrastructure. E-Science has multiple attributes; Tony and Jessie Hey framed e-Science for the library community by characterizing it as a research methodology: "e-Science is not a new scientific discipline in its own right: e-Science is shorthand for the set of tools and technologies required to support collaborative, networked science". In addition to academic libraries' interests in providing support for their researchers engaging in e-Science, the health sciences library community also emerged as a major proponent for creating librarian positions for supporting the information needs of large-scale, networked, research collaborations on their campuses. Neil Rambo, current director of NYU's Health Sciences Library and former director of University of Washington Health Sciences Library, was the first to use the term in the Journal of the Medical Library Association, in his 2009 editorial e-Science and the Biomedical Library. Rambo's definition of e-Science highlighted the potential e-Science held for creating data as a research product: "E-science is a new research methodology, fueled by networked capabilities and the practical possibility of gathering and storing vast amounts of data." In response to this article the University of Massachusetts Medical School Lamar Soutter Library and National Network of Libraries of Medicine, New England Region encouraged health sciences libraries to cooperate to identify skills and develop a program for training e-Science Librarians. Then, in 2013, Shannon Bohle, an archivist who was employed in the library at Cold Spring Harbor Laboratory, an NCI-designated basic cancer research facility, used experience gained there and previous papers and presentations about preserving scientific archival materials to expand the traditional definition of e-Science by including the terms, principles, and practices used in archival science. These included in the definition the "long-term storage and accessibility of all materials generated through the scientific process," as well as examples of material types traditionally preserved in archives, like "electronic/digitized laboratory notebooks, raw and fitted data sets, manuscript production and draft versions, pre-prints," as well as library materials ("print and/or electronic publications"). == Roles == Many areas of science are about to be transformed by the availability of vast amounts of new scientific data that can potentially provide insights at a level of detail never before envisaged. However, this new data dominant era brings new challenges for the scientists and they will need the skills and technologies both of computer scientists and of the library community to manage, search and curate these new data resources. Libraries will not be immune from change in this new world of research. Karen Williams identifies roles in the following areas for librarians in the developing world of e-Science. Campus Engagement Content/Collection Development and Management Teaching and Learning Scholarly Communication E-Scholarship and Digital Tools Reference/Help Services Outreach Fund Raising Exhibit and Event Planning Leadership == Challenges for research libraries == E-science tends toward inter- and multidisciplinary approaches that depend on computation and computer science. Research libraries have traditionally been discipline focused and, although increasingly technologically sophisticated, do not have systems of the scale or complexity of the e-science environment. E-science is data intensive, but research libraries have not typically been responsible for scientific data. E-science is frequently conducted in a team context, often distributed across multiple institutions and on a global scale. The primary constituency of libraries generally comprises those affiliated with the local institution. Licenses for electronic content are typically restricted to a particular institutional community, and the infrastructure to move institutional licenses into a multi-institutional environment is not well developed. E-science challenges all these traditional paradigms of research library organization and services. == Skills == Garritano & Carlson were among the first to outline a skill set for librarians seeking to support the data needs of e-Science; they identified five skill categories librarians new to this area should expect to adapt or develop when participating on such projects: Library and information science expertise Subject expertise Partnerships and outreach (both internal and external) Participating in sponsored research Balancing workload An example of librarians reconfiguring traditional librarian skills to meet the needs of researchers engaging in e-Science is Witt & Carlson's adaptation of the traditional reference interview into a "data interview" in order to provide effective data management and e-Science services. This interview consists of ten practical queries necessary for understanding the provenance and expectations for the preservation of datasets typical of e-Science that also help illustrate some of the educational tools and skills needed by a librarian new to e-Science. "What is the story of the data? What form and format are the data in? What is the expected lifespan of the dataset? How could the data be used, reused, and repurposed? How large is the dataset, and what is its rate of growth? Who are the potential audiences for the data? Who owns the data? Does the dataset include any sensitive information? What publications or discoveries have resulted from the data? How should the data be made accessible?" == Resources == In 2009 the Lamar Soutter Library at the University of Massachusetts Medical School (UMMS) and the National Network of Libraries of Medicine, New England Region (NN/LM NER) funded an e-Science program for building the skills highlighted above for librarians. Elaine Russo Martin, Director of Library Services at the Lamar Soutter Library and Director of the NN/LM NER developed this comprehensive e-Science program to build librarians' subject expertise in the sciences, developing their data management skills, and their familiarity with cyberinfrastructure and e-Science. Three major products of this program are the e-Science web portal for librarians, the E-Science Symposium, and the New England Collaborative Data Management Curriculum (NECDMC). This portal includes educational resources for specific tools and subject/discipline tutorials and modules to assist librarians new to e-Science. UMMS and NN/LM NER also publish an open access journal called the Journal of eScience Librarianship.

    Read more →
  • EJB QL

    EJB QL

    EJB QL or EJB-QL is a portable database query language for Enterprise Java Beans. It was used in Java EE applications. Compared to SQL, however, it is less complex but less powerful as well. == History == The language has been inspired, especially EJB3-QL, by the native Hibernate Query Language. In EJB3 It has been mostly replaced by the Java Persistence Query Language. == Differences == EJB QL is a database query language similar to SQL. The used queries are somewhat different from relational SQL, as it uses a so-called "abstract schema" of the enterprise beans instead of the relational model. In other words, EJB QL queries do not use tables and their components, but enterprise beans, their persistent state, and their relationships. The result of an SQL query is a set of rows with a fixed number of columns. The result of an EJB QL query is either a single object, a collection of entity objects of a given type, or a collection of values retrieved from CMP fields. One has to understand the data model of enterprise beans in order to write effective queries.

    Read more →
  • Algorithmic management

    Algorithmic management

    Algorithmic management is a term used to describe certain labor management practices in the contemporary digital economy. In scholarly uses, the term was initially coined in 2015 by Min Kyung Lee, Daniel Kusbit, Evan Metsky, and Laura Dabbish to describe the managerial role played by algorithms on the Uber and Lyft platforms, but has since been taken up by other scholars to describe more generally the managerial and organisational characteristics of platform economies. However, digital direction of labor was present in manufacturing already since the 1970s and algorithmic management is becoming increasingly widespread across a wide range of industries. The concept of algorithmic management can be broadly defined as the delegation of managerial functions to algorithmic and automated systems. Algorithmic management has been enabled by "recent advances in digital technologies" which allow for the real-time and "large-scale collection of data" which is then used to "improve learning algorithms that carry out learning and control functions traditionally performed by managers". The term does not refer to a specific underlying technology, and encompasses the design choices, organisational policies, and governance that surround the managerial use of algorithms in workplaces. In the contemporary workplace, firms employ an ecology of accounting devices, such as "rankings, lists, classifications, stars and other symbols' in order to effectively manage their operations and create value without the need for traditional forms of hierarchical control." Many of these devices fall under the label of what is called algorithmic management, and were first developed by companies operating in the sharing economy or gig economy, functioning as effective labor and cost cutting measures. The Data&Society explainer of the term, for example, describes algorithmic management as 'a diverse set of technological tools and techniques that structure the conditions of work and remotely manage workforces. Data&Society also provides a list of five typical features of algorithmic management: Prolific data collection and surveillance of workers through technology; Real-time responsiveness to data that informs management decisions; Automated or semi-automated decision-making; Transfer of performance evaluations to rating systems or other metrics; and The use of "nudges" and penalties to indirectly incentivize worker behaviors. Proponents of algorithmic management claim that it "creates new employment opportunities, better and cheaper consumer services, transparency and fairness in parts of the labour market that are characterised by inefficiency, opacity and capricious human bosses." On the other hand, critics of algorithmic management claim that the practice leads to several issues, especially as it impacts the employment status of workers managed by its new array of tools and techniques. == History of the term == "Algorithmic management" was first described by Lee, Kusbit, Metsky, and Dabbish in 2015 in their study of the Uber and Lyft platforms. In their study, Lee et al. termed "software algorithms that assume managerial functions and surrounding institutional devices that support algorithms in practice" algorithmic management. Software algorithms, it was said, are increasingly used to "allocate, optimize, and evaluate work" by platforms in managing their vast workforces. In Lee et al.'s paper on Uber and Lyft this included the use of algorithms to assign work to drivers, as mechanisms to optimise pricing for services, and as systems for evaluating driver performance. In 2016, Alex Rosenblat and Luke Stark sought to extend on this understanding of algorithmic management "to elucidate on the automated implementation of company policies on the behaviours and practices of Uber drivers." Rosenblat and Stark found in their study that algorithmic management practices contributed to a system beset by power asymmetries, where drivers had little control over "critical aspects of their work", whereas Uber had far greater control over the labor of its drivers. Since this time, studies of algorithmic management have extended the use of the term to describe the management practices of various firms, where, for example, algorithms "are taking over scheduling work in fast food restaurants and grocery stores, using various forms of performance metrics ad even mood... to assign the fastest employees to work in peak times." Algorithmic management is seen to be especially prevalent in gig work on platforms, such as on Upwork and Deliveroo, and in the sharing economy, such as in the case of Airbnb. Furthermore, recent research has defined sub-constructs that fall under the umbrella term of algorithmic management, for example, "algorithmic nudging". A Harvard Business Review article published in 2021 explains: "Companies are increasingly using algorithms to manage and control individuals not by force, but rather by nudging them into desirable behavior — in other words, learning from their personalized data and altering their choices in some subtle way." While the concept builds on nudging theory popularized by University of Chicago economist Richard Thaler and Harvard Law School professor Cass Sunstein, "due to recent advances in AI and machine learning, algorithmic nudging is much more powerful than its non-algorithmic counterpart. With so much data about workers' behavioral patterns at their fingertips, companies can now develop personalized strategies for changing individuals' decisions and behaviors at large scale. These algorithms can be adjusted in real-time, making the approach even more effective." == Relationships with other labor management practices == Algorithmic management has been compared and contrasted with other forms of management, such as Scientific management approaches, as pioneered by Frederick Taylor in the early 1900s. Henri Schildt has called algorithmic management "Scientific management 2.0", where management "is no longer a human practice, but a process embedded in technology." Similarly, Kathleen Griesbach, Adam Reich, Luke Elliott-Negri, and Ruth Milkman suggest that, while "algorithmic control over labor may be relatively new, it replicates many features of older mechanisms of labor control." On the other hand, some commentators have argued that algorithmic management is not simply a new form of Scientific management or digital Taylorism, but represents a distinct approach to labor control in platform economies. David Stark and Ivana Pais, for example, state that, "In contrast to Scientific Management at the turn of the twentieth century, in the algorithmic management of the twenty-first century there are rules but these are not bureaucratic, there are rankings but not ranks, and there is monitoring but it is not disciplinary. Algorithmic management does not automate bureaucratic structures and practices to create some new form of algorithmic bureaucracy. Whereas the devices and practices of Taylorism were part of a system of hierarchical supervision, the devices and practices of algorithmic management take place within a different economy of attention and a new regime of visibility. Triangular rather than vertical, and not as a panopticon, the lines of vision in algorithmic management are not lines of supervision." Similarly, Data&Society's explainer for algorithmic management claims that the practice represents a marked departure from earlier management structures that more strongly rely on human supervisors to direct workers. In analyzing the difference and the similarities to previous management styles, David Stark and Pieter Vanden Broeck expand the applicability of algorithmic management beyond the workplace. They develop a theory of algorithmic management in terms of broader changes in the shape and structure of organization in the 21st century, attentive to the erosion of organization's boundaries whereby heterogeneous actors, assets, and activities, are coopted regardless of their place in organizational space. Stark and Vanden Broeck propose the following means of differentiating algorithmic management from other historical managerial paradigms: == Issues == Algorithmic management can provide an effective and efficient means of workforce control and value creation in the contemporary digital economy. However, commentators have highlighted several issues that algorithmic management poses, especially for the workers it manages. Criticisms of the practice often highlight several key issues pertaining to algorithmic management practices, such as the imperfection and scope of its surveillance and control measures, which also threaten to lock workers out of key decision-making processes; its lack of transparency for users and information asymmetries; its potential for bias and discrimination; its dehumanizing tendencies; and its potential to create conditions which sidestep traditional employer-employee accountability. This last point has been especi

    Read more →
  • View model

    View model

    A view model or viewpoints framework in systems engineering, software engineering, and enterprise engineering is a framework which defines a coherent set of views to be used in the construction of a system architecture, software architecture, or enterprise architecture. A view is a representation of the whole system from the perspective of a related set of concerns. Since the early 1990s there have been a number of efforts to prescribe approaches for describing and analyzing system architectures. A result of these efforts have been to define a set of views (or viewpoints). They are sometimes referred to as architecture frameworks or enterprise architecture frameworks, but are usually called "view models". Usually a view is a work product that presents specific architecture data for a given system. However, the same term is sometimes used to refer to a view definition, including the particular viewpoint and the corresponding guidance that defines each concrete view. The term view model is related to view definitions. == Overview == The purpose of views and viewpoints is to enable humans to comprehend very complex systems, to organize the elements of the problem and the solution around domains of expertise and to separate concerns. In the engineering of physically intensive systems, viewpoints often correspond to capabilities and responsibilities within the engineering organization. Most complex system specifications are so extensive that no single individual can fully comprehend all aspects of the specifications. Furthermore, we all have different interests in a given system and different reasons for examining the system's specifications. A business executive will ask different questions of a system make-up than would a system implementer. The concept of viewpoints framework, therefore, is to provide separate viewpoints into the specification of a given complex system in order to facilitate communication with the stakeholders. Each viewpoint satisfies an audience with interest in a particular set of aspects of the system. Each viewpoint may use a specific viewpoint language that optimizes the vocabulary and presentation for the audience of that viewpoint. Viewpoint modeling has become an effective approach for dealing with the inherent complexity of large distributed systems. Architecture description practices, as described in IEEE Std 1471-2000, utilize multiple views to address several areas of concerns, each one focusing on a specific aspect of the system. Examples of architecture frameworks using multiple views include Kruchten's "4+1" view model, the Zachman Framework, TOGAF, DoDAF, and RM-ODP. == History == In the 1970s, methods began to appear in software engineering for modeling with multiple views. Douglas T. Ross and K.E. Schoman in 1977 introduce the constructs context, viewpoint, and vantage point to organize the modeling process in systems requirements definition. According to Ross and Schoman, a viewpoint "makes clear what aspects are considered relevant to achieving ... the overall purpose [of the model]" and determines How do we look at [a subject being modelled]? As examples of viewpoints, the paper offers: Technical, Operational and Economic viewpoints. In 1992, Anthony Finkelstein and others published a very important paper on viewpoints. In that work: "A viewpoint can be thought of as a combination of the idea of an “actor”, “knowledge source”, “role” or “agent” in the development process and the idea of a “view” or “perspective” which an actor maintains." An important idea in this paper was to distinguish "a representation style, the scheme and notation by which the viewpoint expresses what it can see" and "a specification, the statements expressed in the viewpoint's style describing particular domains". Subsequent work, such as IEEE 1471, preserved this distinction by utilizing two separate terms: viewpoint and view, respectively. Since the early 1990s there have been a number of efforts to codify approaches for describing and analyzing system architectures. These are often termed architecture frameworks or sometimes viewpoint sets. Many of these have been funded by the United States Department of Defense, but some have sprung from international or national efforts in ISO or the IEEE. Among these, the IEEE Recommended Practice for Architectural Description of Software-Intensive Systems (IEEE Std 1471-2000) established useful definitions of view, viewpoint, stakeholder and concern and guidelines for documenting a system architecture through the use of multiple views by applying viewpoints to address stakeholder concerns. The advantage of multiple views is that hidden requirements and stakeholder disagreements can be discovered more readily. However, studies show that in practice, the added complexity of reconciling multiple views can undermine this advantage. IEEE 1471 (now ISO/IEC/IEEE 42010:2011, Systems and software engineering — Architecture description) prescribes the contents of architecture descriptions and describes their creation and use under a number of scenarios, including precedented and unprecedented design, evolutionary design, and capture of design of existing systems. In all of these scenarios the overall process is the same: identify stakeholders, elicit concerns, identify a set of viewpoints to be used, and then apply these viewpoint specifications to develop the set of views relevant to the system of interest. Rather than define a particular set of viewpoints, the standard provides uniform mechanisms and requirements for architects and organizations to define their own viewpoints. In 1996 the ISO Reference Model for Open Distributed Processing (RM-ODP) was published to provide a useful framework for describing the architecture and design of large-scale distributed systems. == View model topics == === View === A view of a system is a representation of the system from the perspective of a viewpoint. This viewpoint on a system involves a perspective focusing on specific concerns regarding the system, which suppresses details to provide a simplified model having only those elements related to the concerns of the viewpoint. For example, a security viewpoint focuses on security concerns and a security viewpoint model contains those elements that are related to security from a more general model of a system. A view allows a user to examine a portion of a particular interest area. For example, an Information View may present all functions, organizations, technology, etc. that use a particular piece of information, while the Organizational View may present all functions, technology, and information of concern to a particular organization. In the Zachman Framework views comprise a group of work products whose development requires a particular analytical and technical expertise because they focus on either the “what,” “how,” “who,” “where,” “when,” or “why” of the enterprise. For example, Functional View work products answer the question “how is the mission carried out?” They are most easily developed by experts in functional decomposition using process and activity modeling. They show the enterprise from the point of view of functions. They also may show organizational and information components, but only as they relate to functions. === Viewpoints === In systems engineering, a viewpoint is a partitioning or restriction of concerns in a system. Adoption of a viewpoint is usable so that issues in those aspects can be addressed separately. A good selection of viewpoints also partitions the design of the system into specific areas of expertise. Viewpoints provide the conventions, rules, and languages for constructing, presenting and analysing views. In ISO/IEC 42010:2007 (IEEE-Std-1471-2000) a viewpoint is a specification for an individual view. A view is a representation of a whole system from the perspective of a viewpoint. A view may consist of one or more architectural models. Each such architectural model is developed using the methods established by its associated architectural system, as well as for the system as a whole. === Modeling perspectives === Modeling perspectives is a set of different ways to represent pre-selected aspects of a system. Each perspective has a different focus, conceptualization, dedication and visualization of what the model is representing. In information systems, the traditional way to divide modeling perspectives is to distinguish the structural, functional and behavioral/processual perspectives. This together with rule, object, communication and actor and role perspectives is one way of classifying modeling approaches === Viewpoint model === In any given viewpoint, it is possible to make a model of the system that contains only the objects that are visible from that viewpoint, but also captures all of the objects, relationships and constraints that are present in the system and relevant to that viewpoint. Such a model is said to be a viewpoint model, or a view of the

    Read more →
  • Species distribution modelling

    Species distribution modelling

    Species distribution modelling (SDM), also known as environmental (or ecological) niche modelling (ENM), habitat suitability modelling, predictive habitat distribution modelling, and range mapping uses ecological models to predict the distribution of a species across geographic space and time using environmental data. The environmental data are most often climate data (e.g. temperature, precipitation), but can include other variables such as soil type, water depth, and land cover. SDMs are used in several research areas in conservation biology, ecology and evolution. These models can be used to understand how environmental conditions influence the occurrence or abundance of a species, and for predictive purposes (ecological forecasting). Predictions from an SDM may be of a species' future distribution under climate change, a species' past distribution in order to assess evolutionary relationships, or the potential future distribution of an invasive species. Predictions of current and/or future habitat suitability can be useful for management applications (e.g. reintroduction or translocation of vulnerable species, reserve placement in anticipation of climate change). There are two main types of SDMs. Correlative SDMs, also known as climate envelope models, bioclimatic models, or resource selection function models, model the observed distribution of a species as a function of environmental conditions. Mechanistic SDMs, also known as process-based models or biophysical models, use independently derived information about a species' physiology to develop a model of the environmental conditions under which the species can exist. The extent to which such modelled data reflect real-world species distributions will depend on a number of factors, including the nature, complexity, and accuracy of the models used and the quality of the available environmental data layers; the availability of sufficient and reliable species distribution data as model input; and the influence of various factors such as barriers to dispersal, geologic history, or biotic interactions, that increase the difference between the realized niche and the fundamental niche. Environmental niche modelling may be considered a part of the discipline of biodiversity informatics. == History == A. F. W. Schimper used geographical and environmental factors to explain plant distributions in his 1898 Pflanzengeographie auf physiologischer Grundlage (Plant Geography Upon a Physiological Basis) and his 1908 work of the same name. Andrew Murray used the environment to explain the distribution of mammals in his 1866 The Geographical Distribution of Mammals. Robert Whittaker's work with plants and Robert MacArthur's work with birds strongly established the role the environment plays in species distributions. Elgene O. Box constructed environmental envelope models to predict the range of tree species. His computer simulations were among the earliest uses of species distribution modelling. The adoption of more sophisticated generalised linear models (GLMs) made it possible to create more sophisticated and realistic species distribution models. The expansion of remote sensing and the development of GIS-based environmental modelling increase the amount of environmental information available for model-building and made it easier to use. == Correlative vs mechanistic models == === Correlative SDMs === SDMs originated as correlative models. Correlative SDMs model the observed distribution of a species as a function of geographically referenced climatic predictor variables using multiple regression approaches. Given a set of geographically referred observed presences of a species and a set of climate maps, a model defines the most likely environmental ranges within which a species lives. Correlative SDMs assume that species are at equilibrium with their environment and that the relevant environmental variables have been adequately sampled. The models allow for interpolation between a limited number of species occurrences. For these models to be effective, it is required to gather observations not only of species presences, but also of absences, that is, where the species does not live. Records of species absences are typically not as common as records of presences, thus often "random background" or "pseudo-absence" data are used to fit these models. If there are incomplete records of species occurrences, pseudo-absences can introduce bias. Since correlative SDMs are models of a species' observed distribution, they are models of the realized niche (the environments where a species is found), as opposed to the fundamental niche (the environments where a species can be found, or where the abiotic environment is appropriate for the survival). For a given species, the realized and fundamental niches might be the same, but if a species is geographically confined due to dispersal limitation or species interactions, the realized niche will be smaller than the fundamental niche. Correlative SDMs are easier and faster to implement than mechanistic SDMs, and can make ready use of available data. Since they are correlative however, they do not provide much information about causal mechanisms and are not good for extrapolation. They will also be inaccurate if the observed species range is not at equilibrium (e.g. if a species has been recently introduced and is actively expanding its range). In standard SDMs, the distribution of a single species is often modeled, with unique parameters describing how environmental (abiotic) factors influence its occurrence probability. This allows for differentiated responses to environmental drivers among species, but can be problematic for data-deficient species. In contrast, similarities in environmental responses can be accounted for in multi-species SDMs, which model several species jointly using shared or hierarchically related parameters. However, neither approach explicitly accounts for community-level biotic interactions, which can be important in explaining species diversity patterns. Joint species distribution models (joint SDMs or J-SDMs) address this by modeling species co-occurrence patterns directly. The occurrence probability of a given species is thus influenced not only by abiotic drivers but also by inferred biotic associations with other species. This can improve accuracy for rarer taxa and provide insights into community ecology. Both standard SDMs and J-SDMs can be used to generate community-level metrics, such as species richness, by aggregating outputs across multiple species. These can be important for decision-making such as conservation planning. === Mechanistic SDMs === Mechanistic SDMs are more recently developed. In contrast to correlative models, mechanistic SDMs use physiological information about a species (taken from controlled field or laboratory studies) to determine the range of environmental conditions within which the species can persist. These models aim to directly characterize the fundamental niche, and to project it onto the landscape. A simple model may simply identify threshold values outside of which a species can't survive. A more complex model may consist of several sub-models, e.g. micro-climate conditions given macro-climate conditions, body temperature given micro-climate conditions, fitness or other biological rates (e.g. survival, fecundity) given body temperature (thermal performance curves), resource or energy requirements, and population dynamics. Geographically referenced environmental data are used as model inputs. Because the species distribution predictions are independent of the species' known range, these models are especially useful for species whose range is actively shifting and not at equilibrium, such as invasive species. Mechanistic SDMs incorporate causal mechanisms and are better for extrapolation and non-equilibrium situations. However, they are more labor-intensive to create than correlational models and require the collection and validation of a lot of physiological data, which may not be readily available. The models require many assumptions and parameter estimates, and they can become very complicated. Dispersal, biotic interactions, and evolutionary processes present challenges, as they aren't usually incorporated into either correlative or mechanistic models. Correlational and mechanistic models can be used in combination to gain additional insights. For example, a mechanistic model could be used to identify areas that are clearly outside the species' fundamental niche, and these areas can be marked as absences or excluded from analysis. See for a comparison between mechanistic and correlative models. == Niche models (correlative) == There are a variety of mathematical methods that can be used for fitting, selecting, and evaluating correlative SDMs. Models include "profile" methods, which are simple statistical techniques that use e.g. environmental distance to known sites of occurrence such as

    Read more →
  • Object storage

    Object storage

    Object storage (also known as object-based storage or blob storage) is a computer data storage approach that manages data as "blobs" or "objects", as opposed to other storage architectures like file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks. Each object is typically associated with a variable amount of metadata, and a globally unique identifier. Object storage can be implemented at multiple levels, including the device level (object-storage device), the system level, and the interface level. In each case, object storage seeks to enable capabilities not addressed by other storage architectures, like interfaces that are directly programmable by the application, a namespace that can span multiple instances of physical hardware, and data-management functions like data replication and data distribution at object-level granularity. Object storage systems allow retention of massive amounts of unstructured data in which data is written once and read once (or many times). Object storage is used for purposes such as storing objects like videos and photos on Facebook, songs on Spotify, or files in online collaboration services, such as Dropbox. One of the limitations with object storage is that it is not intended for transactional data, as object storage was not designed to replace NAS file access and sharing; it does not support the locking and sharing mechanisms needed to maintain a single, accurately updated version of a file. == History == === Origins === Jim Starkey coined the term blob working at Digital Equipment Corporation to refer to opaque data entities. The terminology was adopted for Rdb/VMS. Blob is often humorously explained to be an abbreviation for binary large object. According to Starkey, this backronym arose when Terry McKiever, working in marketing at Apollo Computer felt that the term needed to be an abbreviation. McKiever began using the expansion basic large object. This was later eclipsed by the retroactive explanation of blobs as binary large objects. According to Starkey, "Blob don't stand for nothin'." Rejecting the acronym, he explained his motivation behind the coinage, saying, "A blob is the thing that ate Cincinnatti [sic], Cleveland, or whatever", referring to the 1958 science fiction film The Blob. In 1995, research led by Garth Gibson on Network-Attached Secure Disks first promoted the concept of splitting less common operations, like namespace manipulations, from common operations, like reads and writes, to optimize the performance and scale of both. In the same year, a Belgian company – FilePool – was established to build the basis for archiving functions. Object storage was proposed at Gibson's Carnegie Mellon University lab as a research project in 1996. Another key concept was abstracting the writes and reads of data to more flexible data containers (objects). Fine grained access control through object storage architecture was further described by one of the NASD team, Howard Gobioff, who later was one of the inventors of the Google File System. Other related work includes the Coda filesystem project at Carnegie Mellon, which started in 1987, and spawned the Lustre file system. There is also the OceanStore project at UC Berkeley, which started in 1999 and the Logistical Networking project at the University of Tennessee Knoxville, which started in 1998. In 1999, Gibson founded Panasas to commercialize the concepts developed by the NASD team. === Development === Seagate Technology played a central role in the development of object storage. According to the Storage Networking Industry Association (SNIA), "Object storage originated in the late 1990s: Seagate specifications from 1999 Introduced some of the first commands and how operating system effectively removed from consumption of the storage." A preliminary version of the "OBJECT BASED STORAGE DEVICES Command Set Proposal" dated 10/25/1999 was submitted by Seagate as edited by Seagate's Dave Anderson and was the product of work by the National Storage Industry Consortium (NSIC) including contributions by Carnegie Mellon University, Seagate, IBM, Quantum, and StorageTek. This paper was proposed to INCITS T-10 (International Committee for Information Technology Standards) with a goal to form a committee and design a specification based on the SCSI interface protocol. This defined objects as abstracted data, with unique identifiers and metadata, how objects related to file systems, along with many other innovative concepts. Anderson presented many of these ideas at the SNIA conference in October 1999. The presentation revealed an IP Agreement that had been signed in February 1997 between the original collaborators (with Seagate represented by Anderson and Chris Malakapalli) and covered the benefits of object storage, scalable computing, platform independence, and storage management. == Architecture == === Abstraction of storage === One of the design principles of object storage is to abstract some of the lower layers of storage away from the administrators and applications. Thus, data is exposed and managed as objects instead of blocks or (exclusively) files. Objects contain additional descriptive properties which can be used for better indexing or management. Administrators do not have to perform lower-level storage functions like constructing and managing logical volumes to utilize disk capacity or setting RAID levels to deal with disk failure. Object storage also allows the addressing and identification of individual objects by more than just file name and file path. Object storage adds a unique identifier within a bucket, or across the entire system, to support much larger namespaces and eliminate name collisions. === Inclusion of rich custom metadata within the object === Object storage explicitly separates file metadata from data to support additional capabilities. As opposed to fixed metadata in file systems (filename, creation date, type, etc.), object storage provides for full function, custom, object-level metadata in order to: Capture application-specific or user-specific information for better indexing purposes Support data-management policies (e.g. a policy to drive object movement from one storage tier to another) Centralize management of storage across many individual nodes and clusters Optimize metadata storage (e.g. encapsulated, database or key value storage) and caching/indexing (when authoritative metadata is encapsulated with the metadata inside the object) independently from the data storage (e.g. unstructured binary storage) Additionally, in some object-based file-system implementations: The file system clients only contact metadata servers once when the file is opened and then get content directly via object-storage servers (vs. block-based file systems which would require constant metadata access) Data objects can be configured on a per-file basis to allow adaptive stripe width, even across multiple object-storage servers, supporting optimizations in bandwidth and I/O Object-based storage devices (OSD) as well as some software implementations (e.g., DataCore Swarm) manage metadata and data at the storage device level: Instead of providing a block-oriented interface that reads and writes fixed sized blocks of data, data is organized into flexible-sized data containers, called objects Each object has both data (an uninterpreted sequence of bytes) and metadata (an extensible set of attributes describing the object); physically encapsulating both together benefits recoverability. The command interface includes commands to create and delete objects, write bytes and read bytes to and from individual objects, and to set and get attributes on objects Security mechanisms provide per-object and per-command access control === Programmatic data management === Object storage provides programmatic interfaces to allow applications to manipulate data. At the base level, this includes Create, read, update and delete (CRUD) functions for basic read, write and delete operations. Some object storage implementations go further, supporting additional functionality like object/file versioning, object replication, life-cycle management and movement of objects between different tiers and types of storage. Most API implementations are REST-based, allowing the use of many standard HTTP calls. == Implementation == === Cloud storage === The vast majority of cloud storage available in the market leverages an object-storage architecture. Some notable examples are Amazon S3, which debuted in March 2006, Microsoft Azure Blob Storage, IBM Cloud Object Storage, Rackspace Cloud Files (whose code was donated in 2010 to Openstack project and released as OpenStack Swift), and Google Cloud Storage released in May 2010. === Object-based file systems === Some distributed file systems use an object-based architecture, where file metadata is stored in metadata servers and file data is stored i

    Read more →
  • Metadata

    Metadata

    Metadata (or metainformation) is data (or information) that defines and describes the characteristics of other data. It often helps to describe, explain, locate, or otherwise make data easier to retrieve, use, or manage. For example, the title, author, and publication date of a book are metadata about the book. But, while a data asset is finite, its metadata is infinite. As such, efforts to define, classify types, or structure metadata are expressed as examples in the context of its use. The term "metadata" has a history dating to the 1960s where it occurred in computer science and in popular culture. Different types of metadata serve different functions. For example, descriptive metadata for a document might include the author, creation date, file size and keywords. Metadata has various purposes. It can help users find relevant information and discover resources. It can also help organize electronic resources, provide digital identification, and archive and preserve resources. Metadata allows users to access resources by "allowing resources to be found by relevant criteria, identifying resources, bringing similar resources together, distinguishing dissimilar resources, and giving location information". Metadata of telecommunication activities including Internet traffic is very widely collected by various national governmental organizations. This data is used for the purposes of traffic analysis and can be used for mass surveillance. Unique metadata standards exist for different disciplines (e.g., museum collections, digital audio files, websites, etc.). Describing the contents and context of data or data files increases its usefulness. For example, a web page may include metadata specifying what software language the page is written in (e.g., HTML), what tools were used to create it, what subjects the page is about, and where to find more information about the subject. This metadata can automatically improve the reader's experience and make it easier for users to find the web page online. A CD may include metadata providing information about the musicians, singers, and songwriters whose work appears on the disc. In many countries, government organizations routinely store metadata about emails, telephone calls, web pages, video traffic, IP connections, and cell phone locations. == Types == There are many distinct types of metadata, including: Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords. Structural metadata – metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital materials. Administrative metadata – the information to help manage a resource, like resource type, and permissions, and when and how it was created. Reference metadata – the information about the contents and quality of statistical data. Statistical metadata – also called process data, may describe processes that collect, process, or produce statistical data. Legal metadata – provides information about the creator, copyright holder, and public licensing, if provided. Metadata is not strictly bound to one of these categories, as it can describe a piece of data in many other ways. While the metadata application is manifold, covering a large variety of fields, there are specialized and well-accepted models to specify types of metadata. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata. Structural metadata describes the structure of database objects such as tables, columns, keys and indexes. Guide metadata helps humans find specific items and is usually expressed as a set of keywords in a natural language. According to Ralph Kimball, metadata can be divided into three categories: technical metadata (or internal metadata), business metadata (or external metadata), and process metadata. Dan Linstedt, creator of the data vault methodology, says business metadata "...provide[s] definition of the functionality, definition of the data, definition of the elements, and definition of how the data is used within business...business metadata includes business requirements, time-lines, business metrics, business process flows, and business terminology." Business metadata is important because it can greatly facilitate the usefulness of the data to business people. A simple example of business metadata is a glossary entry. Hover functionality in an application or web form can enable a glossary definition to be shown when cursor is on a field or term. Other examples of business metadata include annotation ability within applications. For example, a business user may be viewing a business intelligence (BI) report and notice a trend in the data. The user may have background knowledge as to why this trend occurs. Some business intelligence tools enable the user to create an annotation within the report that explains the trend. Such an annotation can enhance other users' understanding of the data. This example is especially powerful because it is created by a business user for the use of other business people. NISO distinguishes three types of metadata: descriptive, structural, and administrative. Descriptive metadata is typically used for discovery and identification, as information to search and locate an object, such as title, authors, subjects, keywords, and publisher. Structural metadata describes how the components of an object are organized. An example of structural metadata would be how pages are ordered to form chapters of a book. Finally, administrative metadata gives information to help manage the source. Administrative metadata refers to the technical information, such as file type, or when and how the file was created. Two sub-types of administrative metadata are rights management metadata and preservation metadata. Rights management metadata explains intellectual property rights, while preservation metadata contains information to preserve and save a resource. Statistical data repositories have their own requirements for metadata in order to describe not only the source and quality of the data but also what statistical processes were used to create the data, which is of particular importance to the statistical community in order to both validate and improve the process of statistical data production. An additional type of metadata beginning to be more developed is accessibility metadata. Accessibility metadata is not a new concept to libraries; however, advances in universal design have raised its profile. Projects like Cloud4All and GPII identified the lack of common terminologies and models to describe the needs and preferences of users and information that fits those needs as a major gap in providing universal access solutions. Those types of information are accessibility metadata. The Schema.org website has incorporated several accessibility properties based on IMS Global Access for All Information Model Data Element Specification. While the efforts to describe and standardize the varied accessibility needs of information seekers are beginning to become more robust, their adoption into established metadata schemas has not been as developed. For example, while Dublin Core (DC)'s "audience" and MARC 21's "reading level" could be used to identify resources suitable for users with dyslexia and DC's "format" could be used to identify resources available in braille, audio, or large print formats, there is more work to be done. == History == Metadata was traditionally used in the card catalogs of libraries until the 1980s when libraries converted their catalog data to digital databases. In the 2000s, as data and information were increasingly stored digitally, this digital data was described using metadata standards. An early description of "meta data" for computer systems was written by David Griffel and Stuart McIntosh at the MIT Center for International Studies in 1967: "In summary then, we have statements in an object language about subject descriptions of data and token codes for the data. We also have statements in a meta language describing the data relationships and transformations, and ought/is relations between norm and data." == Definition == Metadata means "data about data". Metadata is defined as the data providing information about one or more aspects of the data; it is used to summarize basic information about data that can make tracking and working with specific data easier. Some examples include: Means of creation of the data Source of the data Time and date of creation Creator or author of the data Location on a computer network where the data was created Standards used Data quality For example, a digital image may include metadata that describes the size of the image, its color depth, resolution,

    Read more →
  • Graph cut optimization

    Graph cut optimization

    Graph cut optimization is a combinatorial optimization method applicable to a family of functions of discrete variables, named after the concept of cut in the theory of flow networks. Thanks to the max-flow min-cut theorem, determining the minimum cut over a graph representing a flow network is equivalent to computing the maximum flow over the network. Given a pseudo-Boolean function f {\displaystyle f} , if it is possible to construct a flow network with positive weights such that each cut C {\displaystyle C} of the network can be mapped to an assignment of variables x {\displaystyle \mathbf {x} } to f {\displaystyle f} (and vice versa), and the cost of C {\displaystyle C} equals f ( x ) {\displaystyle f(\mathbf {x} )} (up to an additive constant) then it is possible to find the global optimum of f {\displaystyle f} in polynomial time by computing a minimum cut of the graph. The mapping between cuts and variable assignments is done by representing each variable with one node in the graph and, given a cut, each variable will have a value of 0 if the corresponding node belongs to the component connected to the source, or 1 if it belong to the component connected to the sink. Not all pseudo-Boolean functions can be represented by a flow network, and in the general case the global optimization problem is NP-hard. There exist sufficient conditions to characterise families of functions that can be optimised through graph cuts, such as submodular quadratic functions. Graph cut optimization can be extended to functions of discrete variables with a finite number of values, that can be approached with iterative algorithms with strong optimality properties, computing one graph cut at each iteration. Graph cut optimization is an important tool for inference over graphical models such as Markov random fields or conditional random fields, and it has applications in computer vision problems such as image segmentation, denoising, registration and stereo matching. == Representability == A pseudo-Boolean function f : { 0 , 1 } n → R {\displaystyle f:\{0,1\}^{n}\to \mathbb {R} } is said to be representable if there exists a graph G = ( V , E ) {\displaystyle G=(V,E)} with non-negative weights and with source and sink nodes s {\displaystyle s} and t {\displaystyle t} respectively, and there exists a set of nodes V 0 = { v 1 , … , v n } ⊂ V − { s , t } {\displaystyle V_{0}=\{v_{1},\dots ,v_{n}\}\subset V-\{s,t\}} such that, for each tuple of values ( x 1 , … , x n ) ∈ { 0 , 1 } n {\displaystyle (x_{1},\dots ,x_{n})\in \{0,1\}^{n}} assigned to the variables, f ( x 1 , … , x n ) {\displaystyle f(x_{1},\dots ,x_{n})} equals (up to a constant) the value of the flow determined by a minimum cut C = ( S , T ) {\displaystyle C=(S,T)} of the graph G {\displaystyle G} such that v i ∈ S {\displaystyle v_{i}\in S} if x i = 0 {\displaystyle x_{i}=0} and v i ∈ T {\displaystyle v_{i}\in T} if x i = 1 {\displaystyle x_{i}=1} . It is possible to classify pseudo-Boolean functions according to their order, determined by the maximum number of variables contributing to each single term. All first order functions, where each term depends upon at most one variable, are always representable. Quadratic functions f ( x ) = w 0 + ∑ i w i ( x i ) + ∑ i < j w i j ( x i , x j ) . {\displaystyle f(\mathbf {x} )=w_{0}+\sum _{i}w_{i}(x_{i})+\sum _{i 0 {\displaystyle p>0} then w i j k ( x i , x j , x k ) = w i j k ( 0 , 0 , 0 ) + p 1 ( x i − 1 ) + p 2 ( x j − 1 ) + p 3 ( x k − 1 ) + p 23 ( x j − 1 ) x k + p 31 x i ( x k − 1 ) + p 12 ( x i − 1 ) x j − p x i x j x k {\displaystyle w_{ijk}(x_{i},x_{j},x_{k})=w_{ijk}(0,0,0)+p_{1}(x_{i}-1)+p_{2}(x_{j}-1)+p_{3}(x_{k}-1)+p_{23}(x_{j}-1)x_{k}+p_{31}x_{i}(x_{k}-1)+p_{12}(x_{i}-1)x_{j}-px_{i}x_{j}x_{k}} with p 1 = w i j k ( 1 , 0 , 1 ) − w i j k ( 0 , 0 , 1 ) p 2 = w i j k ( 1 , 1 , 0 ) − w i j k ( 1 , 0 , 1 ) p 3 = w i j k ( 0 , 1 , 1 ) − w i j k ( 0 , 1 , 0 ) p 23 = w i j k ( 0 , 0 , 1 ) + w i j k ( 0 , 1 , 0 ) − w i j k ( 0 , 0 , 0 ) − w i j k ( 0 , 1 , 1 ) p 31 = w i j k ( 0 , 0 , 1 ) + w i j k ( 1 , 0 , 0 ) − w i j k ( 0 , 0 , 0 ) − w i j k ( 1 , 0 , 1 ) p 12 = w i j k ( 0 , 1 , 0 ) + w i j k ( 1 , 0 , 0 ) − w i j k ( 0 , 0 , 0 ) − w i j k ( 1 , 1 , 0 ) . {\displaystyle {\begin{aligned}p_{1}&=w_{ijk}(1,0,1)-w_{ijk}(0,0,1)\\p_{2}&=w_{ijk}(1,1,0)-w_{ijk}(1,0,1)\\p_{3}&=w_{ijk}(0,1,1)-w_{ijk}(0,1,0)\\p_{23}&=w_{ijk}(0,0,1)+w_{ijk}(0,1,0)-w_{ijk}(0,0,0)-w_{ijk}(0,1,1)\\p_{31}&=w_{ijk}(0,0,1)+w_{ijk}(1,0,0)-w_{ijk}(0,0,0)-w_{ijk}(1,0,1)\\p_{12}&=w_{ijk}(0,1,0)+w_{ijk}(1,0,0)-w_{ijk}(0,0,0)-w_{ijk}(1,1

    Read more →
  • Media aggregation platform

    Media aggregation platform

    A Media Aggregation Platform or Media Aggregation Portal (MAP) is an over the top service for distributing web-based streaming media content from multiple sources to a large audience. MAPs consist of networks of sources who host their own content which viewers can choose and access directly from a larger variety of content to choose from than a single source can offer. The service is used by content providers, looking to extend the reach of their content. Unlike multichannel video programming distributor (MVPD) or multiple-system operators (MSO), MAPs rely on the Internet rather than cables or satellite. As more network television channels have moved online in the early 21st century, joining web-native channels like Netflix, MAPs aggregate content the way that MSOs and MVPDs have used cable, and to a lesser extent satellite and IPTV infrastructure. There are companies that offer a similar service for free, including Yidio and StreamingMoviesRight, while others charge a subscription fee like as FreeCast Inc's Rabbit TV Plus. When compared with MSOs and MVPDs, MAP networks have much lower costs due to lack of physical infrastructure. The majority of revenue from MAP services are retained by the content creators, and revenue is instead collected from advertisements, pay-per-view, and subscription-based content offerings instead of licensing and reselling content. MAP service consumers interact and purchase content directly from its source, without the markup added by a middleman.

    Read more →
  • Randomized rounding

    Randomized rounding

    In computer science and operations research, randomized rounding is a widely used approach for designing and analyzing approximation algorithms. Many combinatorial optimization problems are computationally intractable to solve exactly (to optimality). For such problems, randomized rounding can be used to design fast (polynomial time) approximation algorithms—that is, algorithms that are guaranteed to return an approximately optimal solution given any input. The basic idea of randomized rounding is to convert an optimal solution of a relaxation of the problem into an approximately-optimal solution to the original problem. The resulting algorithm is usually analyzed using the probabilistic method. == Overview == The basic approach has three steps: Formulate the problem to be solved as an integer linear program (ILP). Compute an optimal fractional solution x {\displaystyle x} to the linear programming relaxation (LP) of the ILP. Round the fractional solution x {\displaystyle x} of the LP to an integer solution x ′ {\displaystyle x'} of the ILP. (Although the approach is most commonly applied with linear programs, other kinds of relaxations are sometimes used. For example, see Goemans' and Williamson's semidefinite programming-based Max-Cut approximation algorithm.) In the first step, the challenge is to choose a suitable integer linear program. Familiarity with linear programming, in particular modelling using linear programs and integer linear programs, is required. For many problems, there is a natural integer linear program that works well, such as in the Set Cover example below. (The integer linear program should have a small integrality gap; indeed randomized rounding is often used to prove bounds on integrality gaps.) In the second step, the optimal fractional solution can typically be computed in polynomial time using any standard linear programming algorithm. In the third step, the fractional solution must be converted into an integer solution (and thus a solution to the original problem). This is called rounding the fractional solution. The resulting integer solution should (provably) have cost not much larger than the cost of the fractional solution. This will ensure that the cost of the integer solution is not much larger than the cost of the optimal integer solution. The main technique used to do the third step (rounding) is to use randomization, and then to use probabilistic arguments to bound the increase in cost due to the rounding (following the probabilistic method from combinatorics). Therein, probabilistic arguments are used to show the existence of discrete structures with desired properties. In this context, one uses such arguments to show the following: Given any fractional solution x {\displaystyle x} of the LP, with positive probability the randomized rounding process produces an integer solution x ′ {\displaystyle x'} that approximates x {\displaystyle x} according to some desired criterion. Finally, to make the third step computationally efficient, one either shows that x ′ {\displaystyle x'} approximates x {\displaystyle x} with high probability (so that the step can remain randomized) or one derandomizes the rounding step, typically using the method of conditional probabilities. The latter method converts the randomized rounding process into an efficient deterministic process that is guaranteed to reach a good outcome. == Example: the set cover problem == The following example illustrates how randomized rounding can be used to design an approximation algorithm for the set cover problem. Fix any instance ⟨ c , S ⟩ {\displaystyle \langle c,{\mathcal {S}}\rangle } of set cover over a universe U {\displaystyle {\mathcal {U}}} . === Computing the fractional solution === For step 1, let IP be the standard integer linear program for set cover for this instance. For step 2, let LP be the linear programming relaxation of IP, and compute an optimal solution x ∗ {\displaystyle x^{}} to LP using any standard linear programming algorithm. This takes time polynomial in the input size. The feasible solutions to LP are the vectors x {\displaystyle x} that assign each set s ∈ S {\displaystyle s\in {\mathcal {S}}} a non-negative weight x s {\displaystyle x_{s}} , such that, for each element e ∈ U {\displaystyle e\in {\mathcal {U}}} , x ′ {\displaystyle x'} covers e {\displaystyle e} —the total weight assigned to the sets containing e {\displaystyle e} is at least 1, that is, ∑ s ∋ e x s ≥ 1. {\displaystyle \sum _{s\ni e}x_{s}\geq 1.} The optimal solution x ∗ {\displaystyle x^{}} is a feasible solution whose cost ∑ s ∈ S c ( S ) x s ∗ {\displaystyle \sum _{s\in {\mathcal {S}}}c(S)x_{s}^{}} is as small as possible. Note that any set cover C {\displaystyle {\mathcal {C}}} for S {\displaystyle {\mathcal {S}}} gives a feasible solution x {\displaystyle x} (where x s = 1 {\displaystyle x_{s}=1} for s ∈ C {\displaystyle s\in {\mathcal {C}}} , x s = 0 {\displaystyle x_{s}=0} otherwise). The cost of this C {\displaystyle {\mathcal {C}}} equals the cost of x {\displaystyle x} , that is, ∑ s ∈ C c ( s ) = ∑ s ∈ S c ( s ) x s . {\displaystyle \sum _{s\in {\mathcal {C}}}c(s)=\sum _{s\in {\mathcal {S}}}c(s)x_{s}.} In other words, the linear program LP is a relaxation of the given set-cover problem. Since x ∗ {\displaystyle x^{}} has minimum cost among feasible solutions to the LP, the cost of x ∗ {\displaystyle x^{}} is a lower bound on the cost of the optimal set cover. === Randomized rounding step === In step 3, we must convert the minimum-cost fractional set cover x ∗ {\displaystyle x^{}} into a feasible integer solution x ′ {\displaystyle x'} (corresponding to a true set cover). The rounding step should produce an x ′ {\displaystyle x'} that, with positive probability, has cost within a small factor of the cost of x ∗ {\displaystyle x^{}} .Then (since the cost of x ∗ {\displaystyle x^{}} is a lower bound on the cost of the optimal set cover), the cost of x ′ {\displaystyle x'} will be within a small factor of the optimal cost. As a starting point, consider the most natural rounding scheme: For each set s ∈ S {\displaystyle s\in {\mathcal {S}}} in turn, take x s ′ = 1 {\displaystyle x'_{s}=1} with probability min ( 1 , x s ∗ ) {\displaystyle \min(1,x_{s}^{})} , otherwise take x s ′ = 0 {\displaystyle x'_{s}=0} . With this rounding scheme, the expected cost of the chosen sets is at most ∑ s c ( s ) x s ∗ {\displaystyle \sum _{s}c(s)x_{s}^{}} , the cost of the fractional cover. This is good. Unfortunately the coverage is not good. When the variables x s ∗ {\displaystyle x_{s}^{}} are small, the probability that an element e {\displaystyle e} is not covered is about ∏ s ∋ e 1 − x s ∗ ≈ ∏ s ∋ e exp ⁡ ( − x s ∗ ) = exp ⁡ ( − ∑ s ∋ e x s ∗ ) ≈ exp ⁡ ( − 1 ) . {\displaystyle \prod _{s\ni e}1-x_{s}^{}\approx \prod _{s\ni e}\exp(-x_{s}^{})=\exp {\Big (}-\sum _{s\ni e}x_{s}^{}{\Big )}\approx \exp(-1).} So only a constant fraction of the elements will be covered in expectation. To make x ′ {\displaystyle x'} cover every element with high probability, the standard rounding scheme first scales up the rounding probabilities by an appropriate factor λ > 1 {\displaystyle \lambda >1} . Here is the standard rounding scheme: Fix a parameter λ ≥ 1 {\displaystyle \lambda \geq 1} . For each set s ∈ S {\displaystyle s\in {\mathcal {S}}} in turn, take x s ′ = 1 {\displaystyle x'_{s}=1} with probability min ( λ x s ∗ , 1 ) {\displaystyle \min(\lambda x_{s}^{},1)} , otherwise take x s ′ = 0 {\displaystyle x'_{s}=0} . Scaling the probabilities up by λ {\displaystyle \lambda } increases the expected cost by λ {\displaystyle \lambda } , but makes coverage of all elements likely. The idea is to choose λ {\displaystyle \lambda } as small as possible so that all elements are provably covered with non-zero probability. Here is a detailed analysis. ==== Lemma (approximation guarantee for rounding scheme) ==== Fix λ = ln ⁡ ( 2 | U | ) {\displaystyle \lambda =\ln(2|{\mathcal {U}}|)} . With positive probability, the rounding scheme returns a set cover x ′ {\displaystyle x'} of cost at most 2 ln ⁡ ( 2 | U | ) c ⋅ x ∗ {\displaystyle 2\ln(2|{\mathcal {U}}|)c\cdot x^{}} (and thus of cost O ( log ⁡ | U | ) {\displaystyle O(\log |{\mathcal {U}}|)} times the cost of the optimal set cover). (Note: with care the O ( log ⁡ | U | ) {\displaystyle O(\log |{\mathcal {U}}|)} can be reduced to ln ⁡ ( | U | ) + O ( log ⁡ log ⁡ | U | ) {\displaystyle \ln(|{\mathcal {U}}|)+O(\log \log |{\mathcal {U}}|)} .) ==== Proof ==== The output x ′ {\displaystyle x'} of the random rounding scheme has the desired properties as long as none of the following "bad" events occur: the cost c ⋅ x ′ {\displaystyle c\cdot x'} of x ′ {\displaystyle x'} exceeds 2 λ c ⋅ x ∗ {\displaystyle 2\lambda c\cdot x^{}} , or for some element e {\displaystyle e} , x ′ {\displaystyle x'} fails to cover e {\displaystyle e} . The expectation of each x s ′ {\displaystyle x'_{s}} is at most λ x s ∗ {\displaystyle \lambda x_{s

    Read more →
  • UI data binding

    UI data binding

    UI data binding is a software design pattern to simplify development of GUI applications. UI data binding binds UI elements to an application domain model. Most frameworks employ the Observer pattern as the underlying binding mechanism. To work efficiently, UI data binding has to address input validation and data type mapping. A bound control is a widget whose value is tied or bound to a field in a recordset (e.g., a column in a row of a table). Changes made to data within the control are automatically saved to the database when the control's exit event triggers. == Example == == Data binding frameworks and tools == === Delphi === DSharp third-party data binding tool OpenWire Visual Live Binding - third-party visual data binding tool === Java === JFace Data Binding JavaFX Property === .NET === Windows Forms data binding overview WPF data binding overview Avalonia Unity 3D data binding framework (available in modifications for NGUI, iGUI and EZGUI libraries) === JavaScript === Angular AngularJS Backbone.js Ember.js Datum.js knockout.js Meteor, via its Blaze live update engine OpenUI5 React Vue.js

    Read more →
  • Valantic

    Valantic

    Valantic GmbH (stylised as valantic) is an IT service and consulting company headquartered in Munich, Germany. == History == Valantic GmbH was founded in 2012 under the name Dabero Service Group. Until it was renamed Valantic GmbH in 2017, the company merged with IT service providers and consulting firms. These included, among others, Realtime AG, a company for SAP systems. The companies involved in these mergers were also renamed in 2017 and have since used the Valantic brand name. Realtime AG, for example, became Valantic ERP Services AG. During the COVID-19 pandemic and the resulting economic pressures, demand increased for IT service providers, particularly those offering customised software, IT consulting, SAP services, customer experience, cybersecurity, IoT, and digital work environments. In the following years, Valantic expanded by integrating additional companies. In 2021, Valantic expanded into other European countries through the integration of the Dutch company ISM eCompany and the Portuguese consulting firm Abaco. In 2022, the consulting firm C-Clear/Atom Ideas from Belgium joined Valantic. In February 2019, DPE Deutsche Private Equity Management III GmbH (DPE) took over the majority shareholding in Valantic. The founder, Holger von Daniels, and the further management retained a 25% stake. By 2025, DPE had invested €500 million in Valantic. In the following years, Valantic expanded its international locations. In 2023, Valantic incorporated the Danish company Inspari into the group, thereby entering the Scandinavian market. Inspari is a company for Microsoft technologies such as Azure and Power Platform. In the same year, Valantic joined forces with the Aiopsgroup, an international provider of online shopping applications for private and business customers of large companies. The company is based in Bulgaria with additional locations across Eastern Europe and other places. Additionally, the SAP applications division was expanded through the merger with the Spanish company Saptools. As a result, the companies became one of the largest European end-to-end consulting and implementation house for SAP services. By the end of 2023, Valantic had locations in 18 countries. In November 2024, Valantic announced its merger with the Danish digital consultancy Venzo. Through the integration of the company, founded in 2007 and oriented towards Microsoft technologies and digital transformation projects in the areas of automation, artificial intelligence, security, infrastructure and change management, Valantic further expanded its presence in Denmark and the Nordic countries. In July 2025, Valantic announced its merger with Utiligence GmbH, a Mannheim-based consulting firm for SAP technologies. Utiligence works primarily for the energy industry and supports companies in the integration of SAP S/4HANA and the digitalisation of business processes. == Company structure == Valantic is a partnership-based organisation, with partners acting as decision-makers in matters relating to corporate strategy, employee development and acquisitions. Valantic pursues a holacratic approach, promoting an open and self-organised way of working instead of hierarchical structures. By merging with other companies, Valantic is expanding its range of services and tapping into international markets and market shares. The new companies use Valantic's core systems and support processes, but usually retain their original structure. In the 2024 financial year, the company generated revenue of €544 million and employed 3,874 on average. Valantic has over 40 locations internationally. == Services == Valantic GmbH is a consulting firm, software provider and implementation partner. The company offers services in the areas of digital strategy and analytics (business intelligence and data science), customer experience management, SAP services, smart industries (Industry 4.0, supply chain management, and production planning and control processes), and financial services automation. The automation of financial services is aimed at financial service providers and banks. Valantic has been offering services in the field of generative artificial intelligence (GenAI) since 2023. Part of these services involves enabling companies to use GenAI securely and in compliance with regulations in order to make internal work processes more efficient. Its customers include large corporations, several medium-sized companies and DAX-listed companies. == Research == Since 2018, Valantic has published an annual study on the development of the SAP landscape in German-speaking countries. The study examines topics such as the migration to SAP S/4HANA, cloud strategies, technological trends and the use of artificial intelligence in business processes. The 2025 survey of 201 SAP professionals from the DACH region showed, for example, an increase in ongoing and completed S/4HANA migration projects, as well as a further shift towards private-cloud systems. The use of artificial intelligence continued to grow, as did the use of the SAP Business Technology Platform and the Business Data Cloud. In 2025, Valantic, together with the Handelsblatt Research Institute, published the trend study Digital 2030 – The Rise of Applied AI. The study was based on a survey of around 700 executives from companies in Germany, Austria, and Switzerland on the economic effects of current digitalisation trends. According to the study, most respondents consider artificial intelligence, cybersecurity, and cloud computing to hold the greatest strategic importance for business success by 2030. Around 70% of the participating companies stated that they are already achieving measurable business benefits through the use of AI applications, for example in quality control, document management, logistics, or customer service.

    Read more →
  • Hybrid algorithm

    Hybrid algorithm

    A hybrid algorithm is an algorithm that combines two or more other algorithms that solve the same problem, either choosing one based on some characteristic of the data, or switching between them over the course of the algorithm. This is generally done to combine desired features of each, so that the overall algorithm is better than the individual components. "Hybrid algorithm" does not refer to simply combining multiple algorithms to solve a different problem – many algorithms can be considered as combinations of simpler pieces – but only to combining algorithms that solve the same problem, but differ in other characteristics, notably performance. == Examples == In computer science, hybrid algorithms are very common in optimized real-world implementations of recursive algorithms, particularly implementations of divide-and-conquer or decrease-and-conquer algorithms, where the size of the data decreases as one moves deeper in the recursion. In this case, one algorithm is used for the overall approach (on large data), but deep in the recursion, it switches to a different algorithm, which is more efficient on small data. A common example is in sorting algorithms, where the insertion sort, which is inefficient on large data, but very efficient on small data (say, five to ten elements), is used as the final step, after primarily applying another algorithm, such as merge sort or quicksort. Merge sort and quicksort are asymptotically optimal on large data, but the overhead becomes significant if applying them to small data, hence the use of a different algorithm at the end of the recursion. A highly optimized hybrid sorting algorithm is Timsort, which combines merge sort, insertion sort, together with additional logic (including binary search) in the merging logic. A general procedure for a simple hybrid recursive algorithm is short-circuiting the base case, also known as arm's-length recursion. In this case whether the next step will result in the base case is checked before the function call, avoiding an unnecessary function call. For example, in a tree, rather than recursing to a child node and then checking if it is null, checking null before recursing. This is useful for efficiency when the algorithm usually encounters the base case many times, as in many tree algorithms, but is otherwise considered poor style, particularly in academia, due to the added complexity. Another example of hybrid algorithms for performance reasons are introsort and introselect, which combine one algorithm for fast average performance, falling back on another algorithm to ensure (asymptotically) optimal worst-case performance. Introsort begins with a quicksort, but switches to a heap sort if quicksort is not progressing well; analogously introselect begins with quickselect, but switches to median of medians if quickselect is not progressing well. Centralized distributed algorithms can often be considered as hybrid algorithms, consisting of an individual algorithm (run on each distributed processor), and a combining algorithm (run on a centralized distributor) – these correspond respectively to running the entire algorithm on one processor, or running the entire computation on the distributor, combining trivial results (a one-element data set from each processor). A basic example of these algorithms are distribution sorts, particularly used for external sorting, which divide the data into separate subsets, sort the subsets, and then combine the subsets into totally sorted data; examples include bucket sort and flashsort. However, in general distributed algorithms need not be hybrid algorithms, as individual algorithms or combining or communication algorithms may be solving different problems. For example, in models such as MapReduce, the Map and Reduce step solve different problems, and are combined to solve a different, third problem.

    Read more →
  • Chinese speech synthesis

    Chinese speech synthesis

    Chinese speech synthesis is the application of speech synthesis to the Chinese language (usually Standard Chinese). It poses additional difficulties due to Chinese characters frequently having different pronunciations in different contexts and the complex prosody, which is essential to convey the meaning of words, and sometimes the difficulty in obtaining agreement among native speakers concerning what the correct pronunciation is of certain phonemes. == Concatenation (Ekho and KeyTip) == Recordings can be concatenated in any desired combination, but the joins sound forced (as is usual for simple concatenation-based speech synthesis) and this can severely affect prosody; these synthesizers are also inflexible in terms of speed and expression. However, because these synthesizers do not rely on a corpus, there is no noticeable degradation in performance when they are given more unusual or awkward phrases. Ekho is an open source TTS which simply concatenates sampled syllables. It currently supports Cantonese, Mandarin, and experimentally Korean. Some of the Mandarin syllables have been pitched-normalised in Praat. A modified version of these is used in Gradint's "synthesis from partials". cjkware.com used to ship a product called KeyTip Putonghua Reader which worked similarly; it contained 120 Megabytes of sound recordings (GSM-compressed to 40 Megabytes in the evaluation version), comprising 10,000 multi-syllable dictionary words plus single-syllable recordings in 6 different prosodies (4 tones, neutral tone, and an extra third-tone recording for use at the end of a phrase). == Lightweight synthesizers (eSpeak and Yuet) == The lightweight open-source speech project eSpeak, which has its own approach to synthesis, has experimented with Mandarin and Cantonese. eSpeak was used by Google Translate from May 2010 until December 2010. The commercial product "Yuet" is also lightweight (it is intended to be suitable for resource-constrained environments like embedded systems); it was written from scratch in ANSI C starting from 2013. Yuet claims a built-in NLP model that does not require a separate dictionary; the speech synthesised by the engine claims clear word boundaries and emphasis on appropriate words. Communication with its author is required to obtain a copy. Both eSpeak and Yuet can synthesis speech for Cantonese and Mandarin from the same input text, and can output the corresponding romanisation (for Cantonese, Yuet uses Yale and eSpeak uses Jyutping; both use Pinyin for Mandarin). eSpeak does not concern itself with word boundaries when these don't change the question of which syllable should be spoken. == Corpus-based == A "corpus-based" approach can sound very natural in most cases but can err in dealing with unusual phrases if they can't be matched with the corpus. The synthesiser engine is typically very large (hundreds or even thousands of megabytes) due to the size of the corpus. === iFlyTek === Anhui USTC iFlyTek Co., Ltd (iFlyTek) published a W3C paper in which they adapted Speech Synthesis Markup Language to produce a mark-up language called Chinese Speech Synthesis Markup Language (CSSML) which can include additional markup to clarify the pronunciation of characters and to add some prosody information. The amount of data involved is not disclosed by iFlyTek but can be seen from the commercial products that iFlyTek have licensed their technology to; for example, Bider's SpeechPlus is a 1.3 Gigabyte download, 1.2 Gigabytes of which is used for the highly compressed data for a single Chinese voice. iFlyTek's synthesiser can also synthesise mixed Chinese and English text with the same voice (e.g. Chinese sentences containing some English words); they claim their English synthesis to be "average". The iFlyTek corpus appears to be heavily dependent on Chinese characters, and it is not possible to synthesize from pinyin alone. It is sometimes possible by means of CSSML to add pinyin to the characters to disambiguate between multiple possible pronunciations, but this does not always work. === NeoSpeech === There is an online interactive demonstration for NeoSpeech speech synthesis, which accepts Chinese characters and also pinyin if it's enclosed in their proprietary "VTML" markup. === Mac OS === Mac OS had Chinese speech synthesizers available up to version 9. This was removed in 10.0 and reinstated in 10.7 (Lion). === Historical corpus-based synthesizers (no longer available) === A corpus-based approach was taken by Tsinghua University in SinoSonic, with the Harbin dialect voice data taking 800 Megabytes. This was planned to be offered as a download but the link was never activated. Nowadays, only references to it can be found on Internet Archive. Bell Labs' approach, which was demonstrated online in 1997 but subsequently removed, was described in a monograph "Multilingual Text-to-Speech Synthesis: The Bell Labs Approach" (Springer, October 31, 1997, ISBN 978-0-7923-8027-6), and the former employee who was responsible for the project, Chilin Shih (who subsequently worked at the University of Illinois) put some notes about her methods on her website.

    Read more →