AI Headshot Examples

AI Headshot Examples — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Egocentric vision

    Egocentric vision

    Egocentric vision or first-person vision is a sub-field of computer vision that entails analyzing images and videos captured by a wearable camera, which is typically worn on the head or on the chest and naturally approximates the visual field of the camera wearer. Consequently, visual data capture the part of the scene on which the user focuses to carry out the task at hand and offer a valuable perspective to understand the user's activities and their context in a naturalistic setting. The wearable camera looking forwards is often supplemented with a camera looking inward at the user's eye and able to measure a user's eye gaze, which is useful to reveal attention and to better understand the user's activity and intentions. == History == The idea of using a wearable camera to gather visual data from a first-person perspective dates back to the 70s, when Steve Mann invented "Digital Eye Glass", a device that, when worn, causes the human eye itself to effectively become both an electronic camera and a television display. Subsequently, wearable cameras were used for health-related applications in the context of Humanistic Intelligence and Wearable AI. Egocentric vision is best done from the point-of-eye, but may also be done by way of a neck-worn camera when eyeglasses would be in-the-way. This neck-worn variant was popularized by way of the Microsoft SenseCam in 2006 for experimental health research works. The interest of the computer vision community into the egocentric paradigm has been arising slowly entering the 2010s and it is rapidly growing in recent years, boosted by both the impressive advances in the field of wearable technology and by the increasing number of potential applications. The prototypical first-person vision system described by Kanade and Hebert, in 2012 is composed by three basic components: a localization component able to estimate the surrounding, a recognition component able to identify object and people, and an activity recognition component, able to provide information about the current activity of the user. Together, these three components provide a complete situational awareness of the user, which in turn can be used to provide assistance to the user or to the caregiver. Following this idea, the first computational techniques for egocentric analysis focused on hand-related activity recognition and social interaction analysis. Also, given the unconstrained nature of the video and the huge amount of data generated, temporal segmentation and summarization were among the first problems addressed. After almost ten years of egocentric vision (2007–2017), the field is still undergoing diversification. Emerging research topics include: Social saliency estimation Multi-agent egocentric vision systems Privacy preserving techniques and applications Attention-based activity analysis Social interaction analysis Hand pose analysis Ego graphical User Interfaces (EUI) Understanding social dynamics and attention Revisiting robotic vision and machine vision as egocentric sensing Activity forecasting Gaze prediction == Technical challenges == Today's wearable cameras are small and lightweight digital recording devices that can acquire images and videos automatically, without the user intervention, with different resolutions and frame rates, and from a first-person point of view. Therefore, wearable cameras are naturally primed to gather visual information from our everyday interactions since they offer an intimate perspective of the visual field of the camera wearer. Depending on the frame rate, it is common to distinguish between photo-cameras (also called lifelogging cameras) and video-cameras. The former (e.g., Narrative Clip and Microsoft SenseCam), are commonly worn on the chest, and are characterized by a very low frame rate (up to 2fpm) that allows to capture images over a long period of time without the need of recharging the battery. Consequently, they offer considerable potential for inferring knowledge about e.g. behaviour patterns, habits or lifestyle of the user. However, due to the low frame-rate and the free motion of the camera, temporally adjacent images typically present abrupt appearance changes so that motion features cannot be reliably estimated. The latter (e.g., Google Glass, GoPro), are commonly mounted on the head, and capture conventional video (around 35fps) that allows to capture fine temporal details of interactions. Consequently, they offer potential for in-depth analysis of daily or special activities. However, since the camera is moving with the wearer head, it becomes more difficult to estimate the global motion of the wearer and in the case of abrupt movements, the images can result blurred. In both cases, since the camera is worn in a naturalistic setting, visual data present a huge variability in terms of illumination conditions and object appearance. Moreover, the camera wearer is not visible in the image and what he/she is doing has to be inferred from the information in the visual field of the camera, implying that important information about the wearer, such for instance as pose or facial expression estimation, is not available. == Applications == A collection of studies published in a special theme issue of the American Journal of Preventive Medicine has demonstrated the potential of lifelogs captured through wearable cameras from a number of viewpoints. In particular, it has been shown that used as a tool for understanding and tracking lifestyle behaviour, lifelogs would enable the prevention of noncommunicable diseases associated to unhealthy trends and risky profiles (such as obesity and depression). In addition, used as a tool of re-memory cognitive training, lifelogs would enable the prevention of cognitive and functional decline in elderly people. More recently, egocentric cameras have been used to study human and animal cognition, human-human social interaction, human-robot interaction, human expertise in complex tasks. Other applications include navigation/assistive technologies for the blind, monitoring and assistance of industrial workflows, and augmented reality interfaces.

    Read more →
  • Test data management

    Test data management

    Test data management (TDM) is a process in software testing concerned with the creation, preparation, and control of data used for testing software systems. It involves supplying datasets required to execute test cases and verifying system behaviour under defined conditions. Test data management is an integral part of the software development lifecycle (SDLC) and is utilized in both manual and automated testing processes. It is applied in environments that use continuous integration and DevOps practices, where test execution requires consistent and repeatable data conditions. == Overview == Test data management includes the generation, selection, and preparation of data for testing purposes, as well as its distribution across test environments. It also involves controlling data versions and ensuring that datasets correspond to specific test scenarios. In many cases, production data is adapted for testing through techniques such as masking or subsetting to reduce size and remove sensitive content. Test data management ensures that test cases are executed with relevant, consistent, and readily available data. This reduces variability in test results and supports reproducibility across test cycles. == Importance == The role of test data management has expanded with the growth of complex, data-driven systems and regulatory requirements governing data usage. Testing often depends on data that reflects real-world conditions, but direct use of production data may introduce security and privacy risks. As a result, organizations apply methods such as data masking and anonymization to meet compliance requirements, including those set by the California Privacy Rights Act (CPRA) and Europe’s General Data Protection Regulation (GDPR). Inadequate control of test data can lead to incomplete test coverage, unreliable test results, or delays in testing processes due to unavailable or inconsistent datasets. == Techniques and tools == Test data management leverages various techniques for preparing and controlling data used in testing. These include the generation of synthetic data, the extraction of subsets from production datasets, and the modification of data to remove or obscure sensitive information. A key technical requirement in these processes is maintaining referential integrity, or ensuring that relationships between data entities remain consistent across different tables and systems after masking or subsetting. Data virtualization is also used to provide access to datasets without full replication. These methods may be implemented using software tools that automate data preparation, masking, and distribution.

    Read more →
  • Splitwise

    Splitwise

    Splitwise is an online expense-splitting application software accessible via web browser and mobile app. The app facilitates repayments of shared bills by calculating what each person in a group owes. The primary competitor to the app is Venmo, which only operates in the U.S. Splitwise allows users to create groups with friends to determine what each person owes. All expenses and allocations are added to the app, and Splitwise simplifies the transaction history to determine exactly what payments need to be made to whom to settle outstanding balances. Splitwise stores user information via cloud storage. It was developed and is owned by Splitwise Inc., based in Providence, Rhode Island, United States. == History == The app was launched in February 2011 as SplitTheRent, intended to be used for rent splitting, by Ryan Laughlin, Jon Bittner and Marshall Weir. In September 2013, Splitwise was integrated with Venmo to allow users to settle payments via Venmo. In April 2024, Splitwise partnered with Tink, a Visa payment services company, to incorporate a bank transfer feature directly in the Splitwise app. === Financing === In December 2014, the company raised $1.4 million. In October 2016, the company raised $5 million. In April 2021, Splitwise raised $20 million in funding from series A round run by Insight Partners. == Reception == A 2022 opinion piece in The Guardian by London journalist Imogen West-Knights shared the negative effects of exactly splitting bills among friends and family members. West-Knights argued that Splitwise and similar apps can "turn people into those true enemies of all that is fun and joyful in the world: accountants." However, she said the app does work better when used by couples rather than friend groups. Other reviews noted that the app makes people petty. In contrast, an article published by Condé Nast Traveler describes how Splitwise eliminated stress caused by complicated offline bill splitting, saying it "fixed such a pervasive obstacle in group travel." Coverage by The Wall Street Journal lands somewhere in between the two contrasting views, saying Splitwise and similar apps are helpful, but users need to be prepared for difficult money-related conversations that may arise. An etiquette advisor at Debrett's, said, "The less talk you can have about money on any of these occasions, the better." An editor suggested conversations as simple as asking, "We’re splitting this evenly, right?" before a meal.

    Read more →
  • Personal cloud

    Personal cloud

    A personal cloud is a collection of digital content and services that are accessible from any device through the Internet. It is not a tangible entity, but a place that gives users the ability to store, synchronize, stream and share content on a relative core, moving from one platform, screen and location to another. Created on connected services and applications, it reflects and sets consumer expectations for how next-generation computing services will work. The four primary types of personal cloud in use today are: Online cloud, NAS device cloud, server device cloud, and home-made clouds. == Online cloud == The online cloud is sometimes referred to as the public cloud. It is the cloud computing model where online resources like software and data storage are made available over the Internet. Typically, an individual or organization has little control over the ecosystem in which the online cloud is hosted, and the core infrastructure is shared between many individuals and organizations. The data and applications provided by the service provider are logically segregated so that only those authorized are allowed access. == NAS device cloud == A network-attached storage (NAS) device is a computer connected to a network that provides only file-based data storage services to other devices on the network. Although it may technically be possible to run other software on a NAS device, it is not designed to be a general purpose server. Cloud NAS is remote storage that is accessed over the Internet as if it were local. A cloud NAS is often used for backups and archiving. One of the benefits of NAS Cloud is that data in the cloud can be accessed at any time from anywhere. The main drawback, however, is that the speed of the transfer rate is only as fast as the network connection the data is accessed over and can therefore be fairly slow. == Server device cloud == In many ways cloud servers work in the same way as physical servers but the functions they perform can be very different. Typically, the cloud server is an on-premises device that is connected to the Internet and gives users the functions available on the online cloud but with the added benefit and security of the files being in their control on their premises. The server cloud has been historically enterprise-based deployed by businesses needing an in-house cloud. However, there are also in-house options available for individual users. == Home-made clouds == For the more technologically proficient user a common solution for using a personal cloud is to create a home-made cloud system by connecting an external USB hard drive to a Wi-Fi router. This enables both wired and wireless computers to access the USB hard drive and use it for storage or for retrieving files a user needs to share on the network thereby acting like a cloud. Setting up a personal cloud requires a user to have particular skills in technology and network setup. One of the risks associated with improper setup is security, and leaving the files accessible to anyone with technical knowledge. Not every router supports this type of access and modification.

    Read more →
  • Cognitive robotics

    Cognitive robotics

    Cognitive robotics or cognitive technology is a subfield of robotics concerned with endowing a robot with intelligent behavior by providing it with a processing architecture that will allow it to learn and reason about how to behave in response to complex goals in a complex world. Cognitive robotics may be considered the engineering branch of embodied cognitive science and embodied embedded cognition, consisting of robotic process automation, artificial intelligence, machine learning, deep learning, optical character recognition, image processing, process mining, analytics, software development and system integration. == Core issues == While traditional cognitive modeling approaches have assumed symbolic coding schemes as a means for depicting the world, translating the world into these kinds of symbolic representations has proven to be problematic if not untenable. Perception and action and the notion of symbolic representation are therefore core issues to be addressed in cognitive robotics. == Starting point == Cognitive robotics views human or animal cognition as a starting point for the development of robotic information processing, as opposed to more traditional artificial intelligence techniques. Target robotic cognitive capabilities include perception processing, attention allocation, anticipation, planning, complex motor coordination, reasoning about other agents and perhaps even about their own mental states. Robotic cognition embodies the behavior of intelligent agents in the physical world (or a virtual world, in the case of simulated cognitive robotics). Ultimately, the robot must be able to act in the real world. == Learning techniques == === Motor Babble === A preliminary robot learning technique called motor babbling involves correlating pseudo-random complex motor movements by the robot with resulting visual and/or auditory feedback such that the robot may begin to expect a pattern of sensory feedback given a pattern of motor output. Desired sensory feedback may then be used to inform a motor control signal. This is thought to be analogous to how a baby learns to reach for objects or learns to produce speech sounds. For simpler robot systems, where, for instance, inverse kinematics may feasibly be used to transform anticipated feedback (desired motor result) into motor output, this step may be skipped. === Imitation === Once a robot can coordinate its motors to produce a desired result, the technique of learning by imitation may be used. The robot monitors the performance of another agent and then the robot tries to imitate that agent. It is often a challenge to transform imitation information from a complex scene into a desired motor result for the robot. Note that imitation is a high-level form of cognitive behavior and imitation is not necessarily required in a basic model of embodied animal cognition. === Knowledge acquisition === A more complex learning approach is "autonomous knowledge acquisition": the robot is left to explore the environment on its own. A system of goals and beliefs is typically assumed. A somewhat more directed mode of exploration can be achieved by "curiosity" algorithms, such as Intelligent Adaptive Curiosity or Category-Based Intrinsic Motivation. These algorithms generally involve breaking sensory input into a finite number of categories and assigning some sort of prediction system (such as an artificial neural network) to each. The prediction system keeps track of the error in its predictions over time. Reduction in prediction error is considered learning. The robot then preferentially explores categories in which it is learning (or reducing prediction error) the fastest. == Other architectures == Some researchers in cognitive robotics have tried using architectures such as (ACT-R and Soar (cognitive architecture)) as a basis of their cognitive robotics programs. These highly modular symbol-processing architectures have been used to simulate operator performance and human performance when modeling simplistic and symbolized laboratory data. The idea is to extend these architectures to handle real-world sensory input as that input continuously unfolds through time. What is needed is a way to somehow translate the world into a set of symbols and their relationships. == Questions == Some of the fundamental questions to be answered in cognitive robotics are: How much human programming should or can be involved to support the learning processes? How can one quantify progress? Some of the adopted ways are reward and punishment. But what kind of reward and what kind of punishment? In humans, when teaching a child, for example, the reward would be candy or some encouragement, and the punishment can take many forms. But what is an effective way with robots?

    Read more →
  • Foveated rendering

    Foveated rendering

    Foveated rendering is a rendering technique which uses an eye tracker integrated with a virtual reality headset to reduce the rendering workload by greatly reducing the image quality in the peripheral vision (outside of the zone gazed by the fovea). A less sophisticated variant called fixed foveated rendering doesn't utilise eye tracking and instead assumes a fixed focal point. == History == Research into foveated rendering dates back at least to 1991. At Tech Crunch Disrupt SF 2014, Fove unveiled a headset featuring foveated rendering. This was followed by a successful kickstarter in May 2015. At CES 2016, SensoMotoric Instruments (SMI) demoed a new 250 Hz eye tracking system and a working foveated rendering solution. It resulted from a partnership with camera sensor manufacturer Omnivision who provided the camera hardware for the new system. In July 2016, Nvidia demonstrated during SIGGRAPH a new method of foveated rendering claimed to be invisible to users. In February 2017, Qualcomm announced their Snapdragon 835 Virtual Reality Development Kit (VRDK) which includes foveated rendering support called Adreno Foveation. == Use == According to chief scientist Michael Abrash at Oculus, utilising foveated rendering in conjunction with sparse rendering and deep learning image reconstruction has the potential to require an order of magnitude fewer pixels to be rendered in comparison to a full image. Later, these results have been demonstrated and published. In December 2019, fixed foveated rendering support was added to the Oculus Quest SDK. A number of VR headsets have included on-board eye tracking to provide support for foveated rendering, including HTC's Vive Pro Eye (2019), Meta Quest Pro (2022), PlayStation VR2 (2023), and Apple Vision Pro (2024). In 2025, Valve announced the upcoming Steam Frame headset, which applies a variation of the technique known as "foveated streaming" for wireless streaming from a PC to the headset; the method similarly uses variance in bit rate, and is performed at the encoder level rather than the software level.

    Read more →
  • Tertiary review

    Tertiary review

    In software engineering, a tertiary review is a systematic review of systematic reviews. It is also referred to as a tertiary study in the software engineering literature. However, Umbrella review is the term more commonly used in medicine. Kitchenham et al. suggest that methodologically there is no difference between a systematic review and a tertiary review. However, as the software engineering community has started performing tertiary reviews new concerns unique to tertiary reviews have surfaced. These include the challenge of quality assessment of systematic reviews, search validation and the additional risk of double counting. == Examples of Tertiary reviews in software engineering literature == Test quality Machine Learning Test-driven development

    Read more →
  • MySocialCloud

    MySocialCloud

    MySocialCloud is a cloud-based bookmark vault and password website that allows users to log into all of their online accounts from a single, secure website. The company's investors include Sir Richard Branson, Insight Venture Partners’ Jerry Murdock, and PhotoBucket founder Alex Welch. The company and its founders have been featured in TechCrunch and The Huffington Post. == History == MySocialCloud was co-founded by Scott Ferreira, Stacey Ferreira, and Shiv Prakash in 2011. The idea for a one-stop password storage and login tool came when a computer crash left Scott without documents he used to store access information to his online data. In 2013, the siblings sold MySocialCloud to Reputation.com. == Services == MySocialCloud is cloud-based, and the platform lets users securely store passwords and automatically log into several social media websites simultaneously. The website auto-populates password fields, letting the user log into all of the sites at the push of a button. The service also provides users with security updates for the websites they have included in their profile, and informs users if a website has been hacked. Security played a major role during development of the platform. Passwords stored on the service are salted and hashed with a two-way encryption method known as AES.

    Read more →
  • Learning automaton

    Learning automaton

    A learning automaton is one type of machine learning algorithm studied since 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. == History == Research in learning automata can be traced back to the work of Michael Lvovitch Tsetlin in the early 1960s in the Soviet Union. Together with some colleagues, he published a collection of papers on how to use matrices to describe automata functions. Additionally, Tsetlin worked on reasonable and collective automata behaviour, and on automata games. Learning automata were also investigated by researches in the United States in the 1960s. However, the term learning automaton was not used until Narendra and Thathachar introduced it in a survey paper in 1974. == Definition == A learning automaton is an adaptive decision-making unit situated in a random environment that learns the optimal action through repeated interactions with its environment. The actions are chosen according to a specific probability distribution which is updated based on the environment response the automaton obtains by performing a particular action. With respect to the field of reinforcement learning, learning automata are characterized as policy iterators. In contrast to other reinforcement learners, policy iterators directly manipulate the policy π. Another example for policy iterators are evolutionary algorithms. Formally, Narendra and Thathachar define a stochastic automaton to consist of: a set X of possible inputs, a set Φ = { Φ1, ..., Φs } of possible internal states, a set α = { α1, ..., αr } of possible outputs, or actions, with r ≤ s, an initial state probability vector p(0) = ≪ p1(0), ..., ps(0) ≫, a computable function A which after each time step t generates p(t+1) from p(t), the current input, and the current state, and a function G: Φ → α which generates the output at each time step. In their paper, they investigate only stochastic automata with r = s and G being bijective, allowing them to confuse actions and states. The states of such an automaton correspond to the states of a "discrete-state discrete-parameter Markov process". At each time step t=0,1,2,3,..., the automaton reads an input from its environment, updates p(t) to p(t+1) by A, randomly chooses a successor state according to the probabilities p(t+1) and outputs the corresponding action. The automaton's environment, in turn, reads the action and sends the next input to the automaton. Frequently, the input set X = { 0,1 } is used, with 0 and 1 corresponding to a nonpenalty and a penalty response of the environment, respectively; in this case, the automaton should learn to minimize the number of penalty responses, and the feedback loop of automaton and environment is called a "P-model". More generally, a "Q-model" allows an arbitrary finite input set X, and an "S-model" uses the interval [0,1] of real numbers as X. A visualised demo/ Art Work of a single Learning Automaton had been developed by μSystems (microSystems) Research Group at Newcastle University. == Finite action-set learning automata == Finite action-set learning automata (FALA) are a class of learning automata for which the number of possible actions is finite or, in more mathematical terms, for which the size of the action-set is finite.

    Read more →
  • Vote Compass

    Vote Compass

    Vote Compass is an interactive, online voting advice application developed by political scientists and run during election campaigns. It surveys users about their political views and, based on their responses, calculates the individual alignment of each user with the parties or candidates running in a given election contest. It is operated by a social enterprise called Vox Pop Labs in partnership with locale-specific news organizations, including the Wall Street Journal, Vox Media, the Canadian and Australian Broadcasting Corporations, Television New Zealand, France24, RTL Group, and Grupo Globo. Vote Compass also operates under the trademarks Boussole électorale and Wahl-Navi for French- and German-language iterations, respectively. == Background == Vote Compass was developed by Clifton van der Linden, a professor in the Department of Political Science at McMaster University. It is run by van der Linden along with a team of social and statistical scientists from Vox Pop Labs. Although inspired by European Voting Advice Applications, van der Linden explicitly rejects this terminology, arguing that Vote Compass was "never intended to account for every variable that influences voter choice and its results should not be interpreted as voting advice." == Methodology == Using a Likert scale, users indicate their responses to a series of policy propositions designed to discriminate between candidates' policies on prominent issues relevant to the election. Propositions are crafted in collaboration with political scientists local to each jurisdiction in which Vote Compass is run. Based on a candidate or political party's public disclosures (i.e. party manifestos, policy proposals, official websites, speeches, media releases, statements made in the legislature, etc.) they are calibrated on the same propositions and scales as are users. A series of aggregation algorithms calculate the overall distance between the user and the candidates or parties. There have been claims that Vote Compass surveys have the potential to become push polling, if the survey questions posed are poorly designed.

    Read more →
  • List of publications in data science

    List of publications in data science

    This is a list of publications in data science, generally organized by order of use in a data analysis workflow. See the list of publications in statistics for more research-based and fundamental publications; while this list is more applied, business oriented, and cross-disciplinary. General article inclusion criteria are: Papers from notable practitioners or notable professors, either with a Wikipedia page or reference to their notability Common knowledge all data professionals should know, with references validating this claim Highly cited applied statistics and machine learning publications Discussion-facilitating papers on the field of data science as a whole (for example, the Attention Is All You Need paper is arguably a landmark paper that can be added here, but it is specific to generative artificial intelligence, not for all practitioners of data) Some reasons why a particular publication might be regarded as important: Topic creator – A publication that created a new topic Breakthrough – A publication that changed scientific knowledge significantly Influence – A publication which has significantly influenced the world or has had a massive impact on the teaching of data science. When possible, a reference is used to validate the inclusion of the publication in this list. == History == Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) Author: Leo Breiman Publication data: Online version: https://projecteuclid.org/journals/statistical-science/volume-16/issue-3/Statistical-Modeling--The-Two-Cultures-with-comments-and-a/10.1214/ss/1009213726.pdf Description: Describes two cultures of statistics, one using a parsimonious and generative stochastic model, while the other is an algorithmic model with no known mechanism for how the data is generated. Breiman argues that while statistics has traditionally favored using the stochastic model, there is value in expanding the methods that statisticians can use to study phenomenon. Importance: Influence on the philosophies of statisticians right before the increased use of machine learning and deep learning methods. In a 20-year retrospective on this article, "Breiman's words are perhaps more relevant than ever". Notable statisticians at the time wrote opinion pieces about the publication. Although overall critical of the publication, David Cox writes that the publication "contains enough truth and exposes enough weaknesses to be thought-provoking." Bradley Efron commented that this publication is a "stimulating paper". Emanuel Parzen also comments about this publication that "Breiman alerts us to systematic blunders (leading to wrong conclusions) that have been committed applying current statistical practice of data modeling". Data Scientist: The Sexiest Job of the 21st Century Author: Thomas H. Davenport and DJ Patil Publication data: Online version: hbr.org/2022/07/is-data-scientist-still-the-sexiest-job-of-the-21st-century Description: Describes the new role at companies that is coined "Data scientist", what they do, how an organization might recruit one to their organization, and how to work with one effectively. Importance: This publication has been an influence on the data community as mentioned near the time it was published in 2012 by institutions like IEEE Spectrum, but also mentioned nearly a decade later asking the same question the title poses. In a retrospective response to their own publication 10 years earlier, authors Davenport and Patil have reflected that the role of a data scientist has "become better institutionalized, the scope of the job has been redefined, the technology it relies on has made huge strides, and the importance of non-technical expertise, such as ethics and change management, has grown". 50 Years of Data Science Author: David Donoho Publication data: Online version: https://www.tandfonline.com/doi/full/10.1080/10618600.2017.1384734 Description: Retrospective discussion paper on the history and origins of data science, with a number of commentary from notable statisticians. Importance: This has been described as "the first in the field to present such a comprehensive and in-depth survey and overview", and helps to define the field that has many definitions. The Composable Data Management System Manifesto Author: Pedro Pedreira, Orri Erling, Konstantinos Karanasos, Scott Schneider, Wes McKinney, Satya R Valluri, Mohamed Zait, Jacques Nadeau Publication data: Online version: https://www.vldb.org/pvldb/vol16/p2679-pedreira.pdf Description: The vision paper advocating for a paradigm shift in how data management systems are designed using standard, composable, interoperable tools rather than siloed software tools. Importance: A paradigm shifting view on how future data science software tools should be designed for more efficient workflows, the principles of which "will be especially crucial for addressing fragmentation, improving interoperability, and promoting user-centricity as data ecosystems grow increasingly complex". == Data collection and organization == Tidy Data Author: Hadley Wickham Publication data: Online version: https://www.jstatsoft.org/article/view/v059i10/ https://vita.had.co.nz/papers/tidy-data.pdf Description: Describes a framework for data cleaning that is summarized in the quote, "each variable is a column, each observation is a row, and each type of observational unit is a table". This allows a standard data structure for which data analysis tools can be consistently built around. Importance: Cited over 1,500 times, this effort for tidy data has been described by David Donoho as having "more impact on today's practice of data analysis than many highly regarded theoretical statistics articles". In the context of data visualization, this publication is said to support "efficient exploration and prototyping because variables can be assigned different roles in the plot without modifying anything about the original dataset". Data Organization in Spreadsheets Author: Karl W. Broman and Kara H. Woo Publication data: Online version: https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989 Description: This article offers practical recommendations for organizing data in spreadsheets, like Microsoft Excel and Google Sheets, to reduce errors and lower the barrier for later analyses due to limitations in spreadsheets or quirks in the software. Importance: Influences teaching both data and non-data practitioners to create more analysis-friendly spreadsheets, and has been described to outline "spreadsheet best practices". == Data visualizations == Quantitative Graphics in Statistics: A Brief History Author: James R. Beniger and Dorothy L. Robyn Publication data: Online version: https://www.jstor.org/stable/2683467 Description: Outlines history and evolution of quantitative graphics in statistics, going through spatial organization (17th and 18th centuries), discrete comparison (18th and 19th centuries), continuous distribution (19th century), and multivariate distribution and correlation (late 19th and 20th centuries). Importance: Helps put into perspective for learning data practitioners the recency of graphics that are used. A later publication "Graphical Methods in Statistics" by Stephen Fienberg in 1979 writes that his publication "owes much to the work of Beniger and Robyn". == Practice == Data Science for Business Author: Foster Provost and Tom Fawcett Publication data: Online version: N/A Description: Broadly outlines principles of data science and data-analytic thinking for businesses. Importance: Cited over 3,000 times, it is "highly recommended for students" but also it is also recommended due to its "relevance to senior management leaders who want to build and lead a team of data scientists and implement data science in solving complex business problems". == Tooling == Hidden Technical Debt in Machine Learning Systems Author: D. Sculley, Gary Holy, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, Dan Dennison Publication data: Online version: https://proceedings.neurips.cc/paper_files/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf Description: This paper argues that it is "dangerous to think of [complex machine learning] quick wins as coming for free" and overviews risk factors to account for when implementing a machine learning system. Importance: All authors worked for Google, article is cited over 2,000 times, and helped practitioners thinking about quickly implementing a machine learning tool without understanding the long-term maintenance of the tool. A few useful things to know about machine learning Author: Pedro Domingos Publication data: Online version: https://dl.acm.org/doi/10.1145/2347736.2347755 https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf Description: The purpose of this paper is to distill inaccessible "folk knowledge" to effectively implement machine learning projects because "machin

    Read more →
  • Software engineering demographics

    Software engineering demographics

    Software engineers make up a significant portion of the global workforce. As of 2022, there are an estimated 26.9 million professional software engineers worldwide, up from 21 million in 2016. == By country == === United States === In 2023, there were an estimated 1.6 million professional software developers in North America. There are 166 million people employed in the US workforce, making software developers 0.96% of the total workforce. ==== Summary ==== ==== Software engineers vs. traditional engineers ==== The following two tables compare the number of software engineers (611,900 in 2002) versus the number of traditional engineers (1,157,020 in 2002). There are another 1,500,000 people in system analysis, system administration, and computer support, many of whom might be called software engineers. Many systems analysts manage software development teams, and as analysis is an important software engineering role, many of them may be considered software engineers in the near future. This means that the number of software engineers may actually be much higher. It is important to note that the number of software engineers declined by 5 to 10 percent from 2000 to 2002. ==== Computer managers vs. construction and engineering managers ==== Computer and information system managers (264,790) manage software projects, as well as computer operations. Similarly, Construction and engineering managers (413,750) oversee engineering projects, manufacturing plants, and construction sites. Computer management is 64% the size of construction and engineering management. ==== Software engineering educators vs. engineering educators ==== Most people working in the field of computer science, whether making software systems (software engineering) or studying the theoretical and mathematical facts of software systems (computer science), acquire degrees in computer science. According to the U.S. Bureau of Labor Statistics (May 2023 data), there were approximately 44,800 postsecondary computer science teachers and 50,300 engineering teachers, indicating that the computer science educator workforce is nearly 89% as large as that of engineering educators. The combined number of postsecondary chemistry (25,400) and physics (17,100) teachers totaled 42,500, slightly less than the number of computer science educators. ==== Other software and engineering roles ==== ==== Relation to IT demographics ==== Software engineers are part of the much larger software, hardware, application, and operations community. In 2000 in the U.S., there were about 680,000 software engineers and about 10,000,000 IT workers. As of early 2025, there are an estimated 47.2 million software developers worldwide, representing a 50% increase from 31 million in Q1 2022. There are no numbers on testers in the BLS data. === India === There has been a healthy growth in the number of India's IT professionals over the past few years. From a base of 6,800 knowledge workers in 1985–86, the number increased to 522,000 software and services professionals by the end of 2001–02. It is estimated that out of these 528,000 knowledge workers, almost 170,000 are working in the IT software and services export industry; nearly 106,000 are working in the IT enabled services and over 230,000 in user organizations. === Australia === In May 2024, the Australian government reported that 169,300 Australians are employed as software and applications programmers, 17% of who are women. The role grew annually by 8,300 workers. === Russia === According to the Russian government, the number of IT specialists in the country increased by 13% in 2023, reaching approximately 857,000. During the initial phase of the 2022 invasion of Ukraine, an estimated 100,000 IT specialists left Russia.

    Read more →
  • Workplace impact of artificial intelligence

    Workplace impact of artificial intelligence

    The impact of artificial intelligence on workers includes both applications to improve worker safety and health, and potential hazards that must be controlled. One potential application is using AI to eliminate hazards by removing humans from hazardous situations that involve risk of stress, overwork, or musculoskeletal injuries. Predictive analytics may also be used to identify conditions that may lead to hazards such as fatigue, repetitive strain injuries, or toxic substance exposure, leading to earlier interventions. Another is to streamline workplace safety and health workflows through automating repetitive tasks, enhancing safety training programs through virtual reality, or detecting and reporting near misses. When used in the workplace, AI also presents the possibility of new hazards. These may arise from machine learning techniques leading to unpredictable behavior and inscrutability in their decision-making, or from cybersecurity and information privacy issues. Many hazards of AI are psychosocial due to its potential to cause changes in work organization. These include increased monitoring leading to micromanagement, algorithms unintentionally or intentionally mimicking undesirable human biases, and assigning blame for machine errors to the human operator instead. AI may also lead to physical hazards in the form of human–robot collisions, and ergonomic risks of control interfaces and human–machine interactions. Hazard controls include cybersecurity and information privacy measures, communication and transparency with workers about data usage, and limitations on collaborative robots. From a workplace safety and health perspective, only "weak" or "narrow" AI that is tailored to a specific task is relevant, as there are many examples that are currently in use or expected to come into use in the near future. Certain digital technologies are predicted to result in job losses. Starting in the 2020s, the adoption of modern robotics has led to net employment growth. However, many businesses anticipate that automation, or employing robots would result in job losses in the future. This is especially true for companies in Central and Eastern Europe. Other digital technologies, such as platforms or big data, are projected to have a more neutral impact on employment. A large number of tech workers have been laid off starting in 2023; many such job cuts have been attributed to artificial intelligence. == Health and safety applications == In order for any potential AI health and safety application to be adopted, it requires acceptance by both managers and workers. For example, worker acceptance may be diminished by concerns about information privacy, or from a lack of trust and acceptance of the new technology, which may arise from inadequate transparency or training. Alternatively, managers may emphasize increases in economic productivity rather than gains in worker safety and health when implementing AI-based systems. === Eliminating hazardous tasks === AI may increase the scope of work tasks where a worker can be removed from a situation that carries risk. In a sense, while traditional automation can replace the functions of a worker's body with a robot, AI effectively replaces the functions of their brain with a computer. Hazards that can be avoided include stress, overwork, musculoskeletal injuries, and boredom. This can expand the range of affected job sectors into white-collar and service sector jobs such as in medicine, finance, and information technology. === Analytics to reduce risk === Machine learning is used for people analytics to make predictions about worker behavior to assist management decision-making, such as hiring and performance assessment. These could also be used to improve worker health. The analytics may be based on inputs such as online activities, monitoring of communications, location tracking, and voice analysis and body language analysis of filmed interviews. For example, sentiment analysis may be used to spot fatigue to prevent overwork. Decision support systems have a similar ability to be used to, for example, prevent industrial disasters or make disaster response more efficient. For manual material handling workers, predictive analytics and artificial intelligence may be used to reduce musculoskeletal injury. Traditional guidelines are based on statistical averages and are geared towards anthropometrically typical humans. The analysis of large amounts of data from wearable sensors may allow real-time, personalized calculation of ergonomic risk and fatigue management, as well as better analysis of the risk associated with specific job roles. Wearable sensors may also enable earlier intervention against exposure to toxic substances than is possible with area or breathing zone testing on a periodic basis. Furthermore, the large data sets generated could improve workplace health surveillance, risk assessment, and research. === Streamlining safety and health workflows === AI has also been used to attempt to make the workplace safety and health workflow more efficient. One example is coding of workers' compensation claims, which are submitted in a prose narrative form and must manually be assigned standardized codes. AI is being investigated to perform this task faster, more cheaply, and with fewer errors. == Hazards == There are several broad aspects of AI that may give rise to specific hazards. The risks depend on implementation rather than the mere presence of AI. Systems using sub-symbolic AI such as machine learning may behave unpredictably and are more prone to inscrutability in their decision-making. This is especially true if a situation is encountered that was not part of the AI's training dataset, and is exacerbated in environments that are less structured. Undesired behavior may also arise from flaws in the system's perception (arising either from within the software or from sensor degradation), knowledge representation and reasoning, or from software bugs. They may arise from improper training, such as a user applying the same algorithm to two problems that do not have the same requirements. Machine learning applied during the design phase may have different implications than that applied at runtime. Systems using symbolic AI are less prone to unpredictable behavior. The use of AI also increases cybersecurity risks relative to platforms that do not use AI, and information privacy concerns about collected data may pose a hazard to workers. === Psychosocial === Psychosocial hazards are those that arise from the way work is designed, organized, and managed, or its economic and social contexts, rather than arising from a physical substance or object. They cause not only psychiatric and psychological outcomes such as occupational burnout, anxiety disorders, and depression, but they can also cause physical injury or illness such as cardiovascular disease or musculoskeletal injury. Many hazards of AI are psychosocial in nature due to its potential to cause changes in work organization, in terms of increasing complexity and interaction between different organizational factors. However, psychosocial risks are often overlooked by designers of advanced manufacturing systems. Einola and Khoreva explore how different organizational groups perceive and interact with AI technologies. Their research shows that successful AI integration depends on human ownership and contextual understanding. They caution against blind technological optimism and stress the importance of tailoring AI use to specific workplace ecosystems. This perspective reinforces the need for inclusive design and transparent implementation strategies. ==== Changes in work practices ==== Over-reliance on AI tools may lead to deskilling of some professions. When AI becomes a substitute for traditional peer collaboration and mentorship, there is a risk of diminishing opportunities for interpersonal skill development and team-based learning. Increased monitoring may lead to micromanagement and thus to stress and anxiety. A perception of surveillance may also lead to stress. Controls for these include consultation with worker groups, extensive testing, and attention to introduced bias. Wearable sensors, activity trackers, and augmented reality may also lead to stress from micromanagement, both for assembly line workers and gig workers. Gig workers also lack the legal protections and rights of formal workers. Newell & Marabelli argue that AI alters power dynamics and employee autonomy, requiring a more nuanced understanding of its social and organizational implications. There is also the risk of people being forced to work at a robot's pace, or to monitor robot performance at nonstandard hours. A 2025 preprint paper based on users' interactions with the AI chatbot Microsoft Copilot identified forty jobs that the author's claimed had high overlaps with the capabilities of AI. Some media outlets used this paper to report on jobs becoming obsolete. Cri

    Read more →
  • Amaryllo

    Amaryllo

    Amaryllo Inc. is a multinational company founded in Amsterdam, the Netherlands, and now headquartered in the United States. It operates as a cloud service platform, providing cloud storage and cloud computing solutions to enterprises and brand companies. Amaryllo began with Skype IP camera development, pioneering biometric robotic technologies, encrypted P2P network, and secure cloud storage. Amaryllo was founded by Band of Angels member, Marcus Yang to develop patents for a new type of robotic cameras that is claimed to "talk, hear, sense, recognize human faces, and track intruders". It also claims to have made the world's first security robot based on the WebRTC protocol, Icam PRO FHD, and won the 2015 CES Best of Innovation Award under Embedded Technology category. Its home security robots claim to employ 256-bit encryption and run on the WebRTC protocol. Amaryllo products are sold in over 100 Countries across 6 Continents. == History == Amaryllo revealed its first smart home security products at Internationale Funkausstellung Berlin (IFA) 2013 with a Skype-enabled IP camera called iCam HD. Amaryllo announced its second Skype-certified smart home product, iBabi HD, at CES 2014. The company was chosen as a "Cool Vendor" by Gartner in Connected Home 2014. Amaryllo introduced WebRTC-based smart home products after Microsoft terminated embedded Skype services in mid 2014. Since then, Amaryllo has been developing camera robots with auto-tracking and facial recognition technologies. Its camera robots, ATOM AR3 and ATOM AR3S, were introduced in late 2016. It focuses on wired and wireless technology based on AI services. == Cloud Service Platform == Amaryllo offers prepaid cloud storage through digital codes and gift cards, distributed via InComm Payments, Blackhawk Network, and other partners. It provides high-performance cloud computing service through Rescale partnership. Amaryllo provides free cameras under an annual cloud storage subscription on its website. == Global Supercomputing Network (GSN) == The Global Supercomputing Network (GSN) is a distributed high-performance computing (HPC) platform developed by Amaryllo. The network is designed to provide scalable Infrastructure as a Service (IaaS) by connecting a global array of data centers to offer GPU computing resources for specialized industrial and scientific applications. === Architecture and Technology === GSN operates as a decentralized distributed network of servers rather than a single centralized supercomputer. The platform integrates an artificial intelligence assistant named Genie, also developed by Amaryllo. Genie's primary function is to manage computing allocation, helping users identify and connect to available resources across the network’s various nodes based on the specific requirements of their tasks. === Services === The network primarily focuses on the rental of GPU processing resources, catering to fields that require massive parallel processing capabilities, including: Artificial Intelligence and Machine Learning: Training large language models (LLMs) and neural networks. Scientific Simulations: Executing complex calculations in physics, chemistry, and bioinformatics. Data Analytics: Processing large-scale datasets. By utilizing a rental model, GSN allows organizations to access high-end hardware without the capital expenditure associated with purchasing and maintaining physical server infrastructure. === Infrastructure and Partnerships === The network’s physical footprint is expanded through strategic partnerships with data center operators. GSN collaborates with MettaDC and Cyber DC to provide colocation services. These partnerships facilitate the deployment of Nvidia server clusters within secure, Tier-rated facilities, ensuring high availability and connectivity for GSN users. == Official Brand Licensee of HP == Amaryllo Inc. is an official licensee of HP Inc., managing both B2B and B2C cloud services under the HP brand. Through this partnership, Amaryllo offers a range of secure and scalable cloud solutions, including HP Cloud, which provides subscription and one-time payment storage for reliable data backup and storage for individuals, families, and businesses. HP Cloud employs cloud computing technologies to create smart albums for users.

    Read more →
  • Tiki Wiki CMS Groupware

    Tiki Wiki CMS Groupware

    Tiki Wiki CMS Groupware or simply Tiki, originally known as TikiWiki, is a free and open source Wiki-based content management system and online office suite written primarily in PHP and distributed under the GNU Lesser General Public License (LGPL-2.1-only) license. In addition to enabling websites and portals on the internet and on intranets and extranets, Tiki contains a number of collaboration features allowing it to operate as a Geospatial Content Management System (GeoCMS) and Groupware web application. Tiki includes all the basic features common to most CMSs such as the ability to register and maintain individual user accounts within a flexible and rich permission / privilege system, create and manage menus, RSS-feeds, customize page layout, perform logging, and administer the system. All administration tasks are accomplished through a browser-based user interface. Tiki features an all-in-one design, as opposed to a core+extensions model followed by other CMSs. This allows for future-proof upgrades (since all features are released together), but has the drawback of an extremely large codebase (more than 1,000,000 lines). Tiki can run on any computing platform that supports both a web server capable of running PHP 5 (including Apache HTTP Server, IIS, Lighttpd, Hiawatha, Cherokee, and nginx) and a MySQL/MariaDB database to store content and settings. == Major components == Tiki has four major categories of components: content creation and management tools, content organization tools and navigation aids, communication tools, and configuration and administration tools. These components enable administrators and users to create and manage content, as well as letting them communicate to others and configure sites. In addition, Tiki allows each user to choose from various visual themes. These themes are implemented using CSS and the open source Smarty template engine. Additional themes can be created by a Tiki administrator for branding or customization as well. == Internationalization == Tiki is an international project, supporting many languages. The default interface language in Tiki is English, but any language that can be encoded and displayed using the UTF-8 encoding can be supported. Translated strings can be included via an external language file, or by translating interface strings directly, through the database. As of 29 September 2005, Tiki had been fully translated into eight languages and reportedly 90% or more translated into another five languages, as well as partial translations for nine additional languages. Tiki also supports interactive translation of actual wiki pages and was the initial wiki engine used in the Cross Lingual Wiki Engine Project. This allows Tiki-based web sites to have translated content — not just the user interface. == Implementation == Tiki is developed primarily in PHP with some JavaScript code. It uses MySQL/MariaDB as a database. It will run on any server that provides PHP 5, including Apache and Microsoft's IIS. Tiki components make extensive use of other open source projects, including Zend Framework, Smarty, jQuery, HTML Purifier, FCKeditor, Raphaël, phpCAS, and Morcego. When used with Mapserver Tiki can become a Geospatial Content Management System. == Project team == Tiki is under active development by a large international community of over 300 developers and translators, and is one of the largest open-source teams in the world. Project members have donated the resources and bandwidth required to host the tiki.org website and various subdomains. The project members refer to this dependence on their own product as "eating their own dogfood", which they have been doing since the early days of the project. Tiki community members also participate in various related events such as WikiSym and the Libre Software Meeting. == History == Tiki has been hosted on SourceForge.net since its initial release (Release 0.9, named Spica) in October 2002. It was primarily the development of Luis Argerich (Buenos Aires, Argentina), Eduardo Polidor (São Paulo, Brazil), and Garland Foster (Green Bay, WI, United States). In July 2003, Tiki was named the SourceForge.net July 2003 Project of the Month. In late 2003, a fork of Tiki was used to create Bitweaver. In 2006, Tiki was named to CMS Report's Top 30 Web Applications. In 2008, Tiki was named to EContent magazine's Top 100 In 2009, Tiki adopted a six-month release cycle and announced the selection of a Long Term Support (LTS) version and the Tiki Software Community Association was formed as the legal steward for Tiki. The Tiki Software Association is a not-for-profit entity established in Canada. Previously, the entire project was run entirely by volunteers. In 2010, Tiki received Best of Open Source Software Applications Award (BOSSIE) from InfoWorld, in the Applications category. In 2011, Tiki was named to CMS Report's Top 30 Web Applications. In 2012, Tiki was named "Best Web Tool" by WebHostingSearch.com, and "People's Choice: Best Free CMS" by CMS Critic. In 2016, Tiki was named as one of the "10 Best Open Source Collaboration Software Tools" by Small Business Computing. == Name == The name TikiWiki is written in CamelCase, a common Wiki syntax indicating a hyperlink within the Wiki. It is most likely a compound word combining two Polynesian terms, Tiki and Wiki, to create a self-rhyming name similar to wikiwiki, a common variant of wiki. A backronym has also been formed for Tiki: Tightly Integrated Knowledge Infrastructure. == Release Information and History == In general, the Tiki Software Community Association releases a new major version of Tiki Wiki every 8 months where prior, non-LTS, major versions are supported until the first minor version release of the next major version (i.e., 16.0 ⇒ 17.1). Starting with version 12.x, Tiki Wiki LTS is supported for 5 years where it enters a security/maintenance release cycle upon the release of the next LTS version. Tiki Wiki's release history is outlined below.

    Read more →