AI For Business Guide

AI For Business Guide — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Conservative morphological anti-aliasing

    Conservative morphological anti-aliasing (CMAA) is an antialiasing technique originally developed by Filip Strugar at Intel. CMAA is an image-based, post-processing technique similar to morphological antialiasing. CMAA uses four main steps: image analysis for color discontinuities, locally dominant edge detection, simple shape handling, and symmetrical long edge shape handling. A couple of years after CMAA was introduced, Intel unveiled an updated version named CMAA2.

  • Color balance

    In photography and image processing, color balance is the global adjustment of the intensities of the colors (typically red, green, and blue primary colors). An important goal of this adjustment is to render specific colors – particularly neutral colors like white or grey – correctly. Hence, the general method is sometimes called gray balance, neutral balance, or white balance. Color balance changes the overall mixture of colors in an image and is used for color correction. Generalized versions of color balance are used to correct colors other than neutrals or to deliberately change them for effect. White balance is one of the most common kinds of balancing, and is when colors are adjusted to make a white object (such as a piece of paper or a wall) appear white and not a shade of any other colour. Image data acquired by sensors – either film or electronic image sensors – must be transformed from the acquired values to new values that are appropriate for color reproduction or display. Several aspects of the acquisition and display process make such color correction essential – including that the acquisition sensors do not match the sensors in the human eye, that the properties of the display medium must be accounted for, and that the ambient viewing conditions of the acquisition differ from the display viewing conditions. The color balance operations in popular image editing applications usually operate directly on the red, green, and blue channel pixel values, without respect to any color sensing or reproduction model. In film photography, color balance is typically achieved by using color correction filters over the lights or on the camera lens. == Generalized color balance == Sometimes the adjustment to keep neutrals neutral is called white balance, and the phrase color balance refers to the adjustment that in addition makes other colors in a displayed image appear to have the same general appearance as the colors in an original scene. 
It is particularly important that neutral (gray, neutral, white) colors in a scene appear neutral in the reproduction. === Psychological color balance === Humans relate to flesh tones more critically than other colors. Trees, grass and sky can all be off without concern, but if human flesh tones are 'off' then the human subject can look sick or dead. To address this critical color balance issue, the tri-color primaries themselves are formulated to not balance as a true neutral color. The purpose of this color primary imbalance is to more faithfully reproduce the flesh tones through the entire brightness range. == Illuminant estimation and adaptation == Most digital cameras have means to select color correction based on the type of scene lighting, using either manual lighting selection, automatic white balance, or custom white balance. The algorithms for these processes perform generalized chromatic adaptation. Many methods exist for color balancing. Setting a button on a camera is a way for the user to indicate to the processor the nature of the scene lighting. Another option on some cameras is a button which one may press when the camera is pointed at a gray card or other neutral colored object. This captures an image of the ambient light, which enables a digital camera to set the correct color balance for that light. There is a large literature on how one might estimate the ambient lighting from the camera data and then use this information to transform the image data. A variety of algorithms have been proposed, and the quality of these has been debated. A few examples and examination of the references therein will lead the reader to many others. Examples are Retinex, an artificial neural network or a Bayesian method. == Chromatic colors == Color balancing an image affects not only the neutrals, but other colors as well. An image that is not color balanced is said to have a color cast, as everything in the image appears to have been shifted towards one color. 
Color balancing may be thought of in terms of removing this color cast. Color balance is also related to color constancy. Algorithms and techniques used to attain color constancy are frequently used for color balancing, as well. Color constancy is, in turn, related to chromatic adaptation. Conceptually, color balancing consists of two steps: first, determining the illuminant under which an image was captured; and second, scaling the components (e.g., R, G, and B) of the image or otherwise transforming the components so they conform to the viewing illuminant. Viggiano found that white balancing in the camera's native RGB color model tended to produce less color inconstancy (i.e., less distortion of the colors) than in monitor RGB for over 4000 hypothetical sets of camera sensitivities. This difference typically amounted to a factor of more than two in favor of camera RGB. This means that it is advantageous to get color balance right at the time an image is captured, rather than edit later on a monitor. If one must color balance later, balancing the raw image data will tend to produce less distortion of chromatic colors than balancing in monitor RGB. == Mathematics of color balance == Color balancing is sometimes performed on a three-component image (e.g., RGB) using a 3x3 matrix. This type of transformation is appropriate if the image was captured using the wrong white balance setting on a digital camera, or through a color filter. Changing the color balance of an image can improve classifier results on a trained ML model. === Scaling monitor R, G, and B === In principle, one wants to scale all relative luminances in an image so that objects which are believed to be neutral appear so. If, say, a surface with $R = 240$ was believed to be a white object, and if 255 is the count which corresponds to white, one could multiply all red values by 255/240. Doing the same for green and blue would result, at least in theory, in a color balanced image.
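The per-channel scaling just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pixels are plain 8-bit `(R, G, B)` tuples, the function name `white_balance_rgb` is made up for this example, and results are rounded and clipped at 255, which the text does not specify.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def white_balance_rgb(pixels: List[Pixel], white: Pixel, max_value: int = 255) -> List[Pixel]:
    """Scale each channel so the pixel believed to be white maps to max_value.

    Hypothetical helper: the gains are max_value / R'_w, max_value / G'_w and
    max_value / B'_w, i.e. one multiplier per channel.
    """
    if any(channel <= 0 for channel in white):
        raise ValueError("white reference must be positive in every channel")
    gains = [max_value / channel for channel in white]
    balanced = []
    for pixel in pixels:
        # Rounding and clipping are assumptions made for 8-bit output.
        scaled = tuple(min(max_value, round(value * gain)) for value, gain in zip(pixel, gains))
        balanced.append(scaled)
    return balanced


if __name__ == "__main__":
    # The first pixel is the surface believed to be white (red count 240, as in the text).
    image = [(240, 200, 220), (96, 80, 88), (0, 0, 0)]
    print(white_balance_rgb(image, white=image[0]))
```

In this sketch a pixel whose channels are in the same proportion as the white reference comes out neutral, which is the stated goal of the adjustment.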
In this type of transformation the 3x3 matrix is a diagonal matrix:

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 255/R'_w & 0 & 0 \\ 0 & 255/G'_w & 0 \\ 0 & 0 & 255/B'_w \end{bmatrix} \begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}$$

where $R$, $G$, and $B$ are the color balanced red, green, and blue components of a pixel in the image; $R'$, $G'$, and $B'$ are the red, green, and blue components of the image before color balancing, and $R'_w$, $G'_w$, and $B'_w$ are the red, green, and blue components of a pixel which is believed to be a white surface in the image before color balancing. This is a simple scaling of the red, green, and blue channels, and is why color balance tools in Photoshop have a white eyedropper tool. It has been demonstrated that performing the white balancing in the phosphor set assumed by sRGB tends to produce large errors in chromatic colors, even though it can render the neutral surfaces perfectly neutral.

=== Scaling X, Y, Z === If the image may be transformed into CIE XYZ tristimulus values, the color balancing may be performed there. This has been termed a "wrong von Kries" transformation. Although it has been demonstrated to offer usually poorer results than balancing in monitor RGB, it is mentioned here as a bridge to other things.
Mathematically, one computes:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} X_w/X'_w & 0 & 0 \\ 0 & Y_w/Y'_w & 0 \\ 0 & 0 & Z_w/Z'_w \end{bmatrix} \begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix}$$

where $X$, $Y$, and $Z$ are the color-balanced tristimulus values; $X_w$, $Y_w$, and $Z_w$ are the tristimulus values of the viewing illuminant (the white point to which the image is being transformed to conform); $X'_w$, $Y'_w$, and $Z'_w$ are the tristimulus values of an object believed to be white in the un-color-balanced image, and $X'$, $Y'$, and $Z'$ are the tristimulus values of a pixel in the un-color-balanced image. If the tristimulus values of the monitor primaries are in a matrix $\mathbf{P}$ so that:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \mathbf{P} \begin{bmatrix} L_R \\ L_G \\ L_B \end{bmatrix}$$

where $L_R$, $L_G$, and $L_B$ are the un-gamma corrected monitor RGB, one may use: [ L R L G L B ] = P − 1 [ X w / X w ′ 0 0
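As a small numerical illustration of the diagonal XYZ scaling given earlier in this excerpt (the "wrong von Kries" transformation), the sketch below multiplies each tristimulus value by the ratio of the target white to the source white. The function name `scale_xyz` is hypothetical, and the white points used (CIE illuminant A as the source, D65 as the target, normalized to Y = 1) are assumptions chosen for the example, not values from the text.

```python
from typing import Tuple

XYZ = Tuple[float, float, float]

# Assumed example white points (CIE 1931 2-degree observer, Y normalized to 1).
ILLUMINANT_A: XYZ = (1.09850, 1.00000, 0.35585)
D65: XYZ = (0.95047, 1.00000, 1.08883)


def scale_xyz(xyz: XYZ, source_white: XYZ, target_white: XYZ) -> XYZ:
    """Apply the diagonal scaling X = (X_w / X'_w) * X', and likewise for Y and Z."""
    return tuple(
        value * target / source
        for value, source, target in zip(xyz, source_white, target_white)
    )


if __name__ == "__main__":
    # The source white itself should land on the target white.
    print(scale_xyz(ILLUMINANT_A, ILLUMINANT_A, D65))
```

By construction the object believed to be white maps onto the viewing illuminant; as the text notes, other colors are usually handled less well by this scaling than by balancing in monitor RGB.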

  • Summify

    Summify was a social news aggregator founded by Mircea Paşoi and Cristian Strat, two former Google and Microsoft interns from Romania. The service emailed its users a periodic summary of news articles shared from their social networks based on their relevance and importance. The platform supported Twitter, Facebook, and Google Reader accounts. == History == In 2009, Paşoi and Strat created ReadFu, a plugin that provided a contextual summary and statistics of the target page of a hyperlink. In January 2010, ReadFu was accepted into the Vancouver-based start-up incubator Bootup Labs. On March 20, 2010 the service was renamed to Summify and a private beta began. On August 11, 2010 Paşoi and Strat announced a new direction for the service. It would become a real-time social news reader that aggregates incoming news from social networks and displays articles by importance using social reactions. After some feedback that the users preferred article digests by email more than the real-time news reader version, Summify discontinued the news reader version. In March 2011, Summify completed a Seed round, with investors including Rob Glaser, Accel Partners, and Stewart Butterfield. Summify received coverage from various news and media outlets such as TechCrunch. It was also featured in various news platforms, such as Time, The Globe and Mail, Mashable, VentureBeat, Gizmodo, Lifehacker, and The Next Web. Summify released a free app on the Apple App Store on July 8, 2011. The app allowed users to read their web summaries from iOS mobile devices. Summify was acquired by Twitter on January 19, 2012. The service shut down soon after, on June 22, 2012.

  • Windows Live OneCare Safety Scanner

    Windows Live OneCare Safety Scanner (formerly Windows Live Safety Center and codenamed Vegas) was an online scanning, PC cleanup, and diagnosis service to help remove viruses, spyware/adware, and other malware. It was a free web service that was part of Windows Live. On November 18, 2008, Microsoft announced the discontinuation of Windows Live OneCare, offering users a new free anti-malware suite, Microsoft Security Essentials, which became available in the second half of 2009. However, Windows Live OneCare Safety Scanner, under the same branding as Windows Live OneCare, was not discontinued during that time. The service was officially discontinued on April 15, 2011 and replaced with Microsoft Safety Scanner. == Overview == Windows Live OneCare Safety Scanner offered free online scanning and protection from threats. The Windows Live OneCare Safety Scanner must be downloaded and installed on the user's computer before it can scan it. The "Full Service Scan" looks for common PC health issues such as viruses, temporary files, and open network ports. It searches for and removes viruses, improves a computer's performance, and removes unnecessary clutter on the PC's hard disk. The user can choose between a "Full Scan" (which can be customized) or a "Quick Scan". The "Full Scan" scans for viruses (comprehensive scan or quick scan), hard disk performance (disk fragmentation scan and/or disk cleanup scan) and network safety (open port scan). The "Quick Scan" only scans for viruses, and only in specific areas of the computer. The quick scan is faster than the full scan, hence the name. The service also provides a virus database, information about online threats, and general computer security documentation and tools. == Limits == The virus scanner on the Windows Live OneCare Safety Scanner site runs a scan of the user's computer only when the site is visited.
It does not run periodic scans of the system, and does not provide features to prevent viruses from infecting the computer at the time, or thereafter. It simply resolves detected infections. Many users who have posted on the Product Feedback forum report script errors relating to Internet Explorer 7 (besides IE being the only browser supported by this service). The OneCare safety scanner team have been actively solving these problems, many of them registry-related.

  • Connected-component labeling

    Connected-component labeling (CCL), connected-component analysis (CCA), blob extraction, region labeling, blob discovery, or region extraction is an algorithmic application of graph theory, where subsets of connected components are uniquely labeled based on a given heuristic. Connected-component labeling is not to be confused with segmentation. Connected-component labeling is used in computer vision to detect connected regions in binary digital images, although color images and data with higher dimensionality can also be processed. When integrated into an image recognition system or human-computer interaction interface, connected component labeling can operate on a variety of information. Blob extraction is generally performed on the binary image resulting from a thresholding step, but it can be applied to gray-scale and color images as well. Blobs may be counted, filtered, and tracked. Blob extraction is related to but distinct from blob detection. == Overview == A graph, containing vertices and connecting edges, is constructed from relevant input data. The vertices contain information required by the comparison heuristic, while the edges indicate connected 'neighbors'. An algorithm traverses the graph, labeling the vertices based on the connectivity and relative values of their neighbors. Connectivity is determined by the medium; image graphs, for example, can use a 4-connected or an 8-connected neighborhood. Following the labeling stage, the graph may be partitioned into subsets, after which the original information can be recovered and processed. == Definition == The usage of the term connected-component labeling (CCL) and its definition are quite consistent in the academic literature, whereas connected-component analysis (CCA) varies both in terminology and in its definition of the problem. Rosenfeld et al.
define connected components labeling as the “[c]reation of a labeled image in which the positions associated with the same connected component of the binary input image have a unique label.” Shapiro et al. define CCL as an operator whose “input is a binary image and [...] output is a symbolic image in which the label assigned to each pixel is an integer uniquely identifying the connected component to which that pixel belongs.” There is no consensus on the definition of CCA in the academic literature. It is often used interchangeably with CCL. A more extensive definition is given by Shapiro et al.: “Connected component analysis consists of connected component labeling of the black pixels followed by property measurement of the component regions and decision making.” The definition for connected-component analysis presented here is more general, taking the thoughts expressed in the works cited above into account. == Algorithms == The algorithms discussed can be generalised to arbitrary dimensions, albeit with increased time and space complexity. === One component at a time === This is a fast and very simple method to implement and understand. It is based on graph traversal methods in graph theory. In short, once the first pixel of a connected component is found, all the connected pixels of that connected component are labelled before going on to the next pixel in the image. This algorithm is part of Vincent and Soille's watershed segmentation algorithm; other implementations also exist. To do that, a linked list is formed that keeps the indexes of the pixels that are connected to each other (steps (2) and (3) below). The method of defining the linked list determines whether a depth-first or a breadth-first search is used. For this particular application, it makes no difference which strategy is used. The simplest kind of last-in-first-out queue, implemented as a singly linked list, will result in a depth-first search strategy.
It is assumed that the input image is a binary image, with pixels being either background or foreground, and that the connected components in the foreground pixels are desired. The algorithm steps can be written as:

  (1) Start from the first pixel in the image. Set current label to 1. Go to (2).
  (2) If this pixel is a foreground pixel and it is not already labelled, give it the current label and add it as the first element in a queue, then go to (3). If it is a background pixel or it was already labelled, then repeat (2) for the next pixel in the image.
  (3) Pop out an element from the queue, and look at its neighbours (based on any type of connectivity). If a neighbour is a foreground pixel and is not already labelled, give it the current label and add it to the queue. Repeat (3) until there are no more elements in the queue.
  (4) Go to (2) for the next pixel in the image and increment current label by 1.

Note that the pixels are labelled before being put into the queue. The queue will only keep a pixel to check its neighbours and add them to the queue if necessary. This algorithm only needs to check the neighbours of each foreground pixel once and doesn't check the neighbours of background pixels.
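Steps (1) to (4) above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the image is assumed to be a list of rows with truthy foreground values, label 0 is reserved for background, and the function name `label_components` and its `connectivity` parameter are choices made for this example. A `deque` used first-in-first-out gives a breadth-first traversal; as noted earlier, the traversal order does not change the labeling.

```python
from collections import deque
from typing import List


def label_components(image: List[List[int]], connectivity: int = 4) -> List[List[int]]:
    """Label foreground pixels one connected component at a time."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]  # 0 means background or not yet labelled
    current = 1  # step (1)

    for r in range(rows):
        for c in range(cols):
            # Step (2): skip background pixels and pixels that already have a label.
            if not image[r][c] or labels[r][c]:
                continue
            labels[r][c] = current  # label before enqueueing
            queue = deque([(r, c)])
            # Step (3): grow the component until the queue is empty.
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if 0 <= nr < rows and 0 <= nc < cols and image[nr][nc] and not labels[nr][nc]:
                        labels[nr][nc] = current
                        queue.append((nr, nc))
            current += 1  # step (4)
    return labels


if __name__ == "__main__":
    picture = [
        [1, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 1],
        [1, 0, 0, 0],
    ]
    for row in label_components(picture):
        print(row)
```

Because each pixel is labelled before it enters the queue, no pixel is enqueued twice, which matches the note above.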
The pseudocode is:

    algorithm OneComponentAtATime(data)
        input : imageData[xDim][yDim]
        initialization : label = 0, labelArray[xDim][yDim] = 0, statusArray[xDim][yDim] = false, queue1, queue2;
        for i = 0 to xDim do
            for j = 0 to yDim do
                if imageData[i][j] has not been processed do
                    if imageData[i][j] is a foreground pixel do
                        check its four neighbors(north, south, east, west) :
                            if neighbor is not processed do
                                if neighbor is a foreground pixel do
                                    add it to queue1
                                else
                                    update its status to processed
                                end if
                        labelArray[i][j] = label (give label)
                        statusArray[i][j] = true (update status)
                        while queue1 is not empty do
                            For each pixel in the queue do :
                                check its four neighbors
                                    if neighbor is not processed do
                                        if neighbor is a foreground pixel do
                                            add it to queue2
                                        else
                                            update its status to processed
                                        end if
                                give it the current label
                                update its status to processed
                            remove the current element from queue1
                            copy queue2 into queue1
                        end While
                        increase the label
                    end if
                    else
                        update its status to processed
                    end if
                end if
            end if
        end for
    end for

=== Two-pass === Relatively simple to implement and understand, the two-pass algorithm (also known as the Hoshen–Kopelman algorithm) iterates through 2-dimensional binary data. The algorithm makes two passes over the image: the first pass to assign temporary labels and record equivalences, and the second pass to replace each temporary label by the smallest label of its equivalence class. The input data can be modified in situ (which carries the risk of data corruption), or labeling information can be maintained in an additional data structure. Connectivity checks are carried out by checking neighbor pixels' labels (neighbor elements whose labels are not assigned yet are ignored), namely the north-east, north, north-west and west neighbors of the current pixel (assuming 8-connectivity). 4-connectivity uses only the north and west neighbors of the current pixel.
The following conditions are checked to determine the value of the label to be assigned to the current pixel (4-connectivity is assumed).

Conditions to check:

  • Does the pixel to the left (west) have the same value as the current pixel?
      Yes – We are in the same region. Assign the same label to the current pixel.
      No – Check next condition.
  • Do both pixels to the north and west of the current pixel have the same value as the current pixel but not the same label?
      Yes – We know that the north and west pixels belong to the same region and must be merged. Assign the current pixel the minimum of the north and west labels, and record their equivalence relationship.
      No – Check next condition.
  • Does the pixel to the left (west) have a different value and the one to the north the same value as the current pixel?
      Yes – Assign the label of the north pixel to the current pixel.
      No – Check next condition.
  • Do the pixel's north and west neighbors have different pixel values than current pixel?
      Yes – Create a new label id and assign it to the current pixel.

The algorithm continues this way, and creates new region labels whenever necessary. The key to a fast algorithm, however, is how this merging is done. This algorithm uses the union-find data structure which provides excellent performance for keeping track of equivalence relationships. Union-find essentially stores labels which correspond to the same blob in a disjoint-set data structure, making it easy to remember the equivalence of two labels by the use of an interface method, e.g. findSet(l). findSet(l) returns the minimum label value that is equivalent to the function argument 'l'. Once the initial labeling and equivalence recording is completed, the second pass merely replaces each pixel label with its equivalent disjoint-set representative element. A faster-scanning algorithm for connected-region extraction is presented below.
On the first pass:

  • Iterate through each element of the data by column, then by row (Raster Scanning)
  • If the element is not the background
      • Get the neighboring elements of the current element
      • If there are no neighbors, uniquely
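The two-pass scheme described in this excerpt (temporary labels and recorded equivalences on the first pass, relabeling on the second) can be sketched as below. It is a hedged illustration that assumes a binary image, 4-connectivity (north and west neighbors only), and label 0 for background. The helper names `find`, `union` and `two_pass_label` are made up for this example; `find` plays the role of the `findSet(l)` method mentioned above. Linking the larger root to the smaller one keeps the minimum label as the representative of each set.

```python
from typing import Dict, List


def find(parent: Dict[int, int], label: int) -> int:
    """Return the representative (smallest) label of the set containing `label`."""
    while parent[label] != label:
        parent[label] = parent[parent[label]]  # path halving keeps trees shallow
        label = parent[label]
    return label


def union(parent: Dict[int, int], a: int, b: int) -> None:
    """Record that labels a and b are equivalent; the smaller root stays the root."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a < root_b:
        parent[root_b] = root_a
    elif root_b < root_a:
        parent[root_a] = root_b


def two_pass_label(image: List[List[int]]) -> List[List[int]]:
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    parent: Dict[int, int] = {}
    next_label = 1

    # First pass: assign temporary labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            north = labels[r - 1][c] if r > 0 else 0
            west = labels[r][c - 1] if c > 0 else 0
            if north and west:
                labels[r][c] = min(north, west)
                union(parent, north, west)
            elif north or west:
                labels[r][c] = north or west
            else:
                parent[next_label] = next_label
                labels[r][c] = next_label
                next_label += 1

    # Second pass: replace each temporary label by its representative.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                labels[r][c] = find(parent, labels[r][c])
    return labels


if __name__ == "__main__":
    u_shape = [
        [1, 0, 1],
        [1, 0, 1],
        [1, 1, 1],
    ]
    for row in two_pass_label(u_shape):
        print(row)
```

In the U-shaped example the two arms receive different temporary labels and are merged only when the scan reaches the bottom-right pixel. Final labels are not guaranteed to be consecutive, since merged temporary labels leave gaps.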

  • FreshBooks

    FreshBooks is accounting software operated by 2ndSite Inc. primarily for small and medium-sized businesses. It is a web-based software as a service (SaaS) model, that can be accessed through a desktop or mobile device. The company was founded in 2003 and is based in Toronto, Canada. == History == FreshBooks was founded in 2004 by Mike McDerment, Levi Cooperman, and Joe Sawada in Toronto, Ontario. McDerment incorporated a second company, BillSpring in January 2015 to work on new product development. It was rolled back into FreshBooks as an updated interface in 2016. Initially FreshBooks functioned like an electronic invoicing program targeting IT professionals. After the release of the new interface, the initial release of FreshBooks was referred to as "FreshBooks Classic." FreshBooks Classic was discontinued in 2022 after migrating users to the new platform. FreshBooks Classic's front-end application was built in PHP, and the backend services were built in Python while the new FreshBooks uses the same backend services with a JavaScript single-page application. == Product == FreshBooks is a subscription-based accounting software platform that provides features such as invoicing, accounts payable, expense and time tracking, retainers, fixed asset depreciation, purchase orders, payroll integrations, mileage tracking, double-entry accounting, and standard business reporting. Financial data is stored in the cloud on a unified ledger, enabling access from desktop and mobile devices. The platform includes a free API for integration with external applications and supports multiple tax rates and currencies. It also offers project management and payroll functionalities. Pricing is based on a recurring monthly fee. FreshBooks supports country-specific tax calculations, including GST and HST in Canada, sales taxes in the United States, and MTD compliance in the UK. 
== Operations == FreshBooks has its headquarters in Toronto, Canada with operations in North America, Europe and Australia. Founder Mike McDerment was the chief executive officer of the company from 2003 until 2021, when he stepped down and was replaced by Don Epperson, but stayed as the executive chair. Don Epperson had previously joined FreshBooks as executive director in 2019. == Funding == FreshBooks was initially self-funded. In 2014, the company raised a Series A venture investment of $30 million led by the venture capital firm Oak Investment Partners, with participation by Georgian Partners and Atlas Venture. In 2017, FreshBooks announced that it raised another $43 million in funding from Accomplice, Georgian Partners and Oak Investment Partners. On August 10, 2021, FreshBooks announced that it had secured $80.75 million in Series E funding and $50 million in debt financing. FreshBooks also reached a valuation of more than $1 billion.

  • Wrike

    Wrike, Inc. is an American project management application service provider based in San Jose, California. Wrike also has offices in India, Dallas, Tallinn, Nicosia, Dublin, Tokyo, Melbourne, and Prague. == History == Wrike was founded in 2006 by Andrew Filev. The current CEO of Wrike is Thomas Scott. Filev initially self-funded the company before later obtaining investor funding. Wrike released the beta version of its software (also called Wrike) in December 2006. The company then launched a new "Enterprise" platform in December 2013. In June 2015, Wrike announced the opening of an office in Dublin, Ireland, and in 2016 Wrike launched a datacenter there to host data in compliance with local privacy regulations. In July 2016, Wrike announced the launch of Wrike for Marketers. That same year, Wrike's headquarters moved from Mountain View to San Jose, California. In January 2021, Citrix Systems announced its intention to acquire Wrike for $2.25 billion. The acquisition closed in March 2021. On January 31, 2022, it was announced that Citrix had been acquired in a $16.5 billion deal by affiliates of Vista Equity Partners and Evergreen Coast Capital. Citrix would merge with TIBCO Software, a Vista portfolio company, to form Cloud Software Group (CSG). In September 2022, Wrike separated from Citrix Systems. In July 2023, Vista transferred ownership to Symphony Technology Group. == Investments == Wrike received $1 million in angel funding in 2012 from TMT Investments. In October 2013, Wrike secured $10 million in investment funding from Bain Capital. In May 2015, the company secured $15 million in a new round of funding. Investors included Scale Venture Partners, DCM Ventures, and Bain Capital. At that time, Wrike had 8,000 customers, 200 employees, and 30,000 new users each month.
On November 29, 2018, Wrike signed a definitive agreement to receive a majority investment by Vista Equity Partners (“Vista”), a firm focused on software, data and technology-enabled businesses. == Software == The Wrike project management software is a Software-as-a-Service (SaaS) product with tools for managing projects, deadlines, schedules, and workflow processes. It includes collaboration features. The application is available in English, French, Spanish, German, Portuguese, Italian, Japanese and Russian. Wrike has triggers for task automation in workflow management. === Features === Wrike features a multi-pane UI and consists of features in two categories: project management, and team collaboration. According to Wrike, project management features are designed to help teams track dates and dependencies associated with projects, manage assignments and resources, and track time. These include an interactive Gantt chart, a workload view, and a sortable table that can be customized to store project data. The software includes a co-editing tool, discussion threads on tasks, and tools for attaching documents, editing them, and tracking their changes. Wrike uses an "inbox" feature and browser notifications to alert users of updates from their colleagues and dashboards for quick overviews of pending tasks. These updates are also available in Wrike's mobile apps on iOS and Android. Wrike has an optional feature set called "Wrike for Marketers" which has several tools for managing marketing workflows. In May 2012, Wrike announced the launch of a freemium version of its software for teams of up to 5 users. That year also saw the integration of a live text coeditor into its workspace to unify collaboration and task management. In late 2013 Wrike released a new feature set called Wrike Enterprise which included advanced analytics and other tools targeted at large business customers. 
Since then it has released several major updates to Wrike Enterprise, including a customizable spreadsheet called "Dynamic Platform" in late 2014 and custom workflows for teams in 2015. In July 2016, Wrike was updated with a set of add-on features under the name "Wrike for Marketers," which includes integrations with Adobe Photoshop, a tool for submitting requests, and proofing and approval tools for creative assets like videos and images. Wrike is available as native Android and iOS apps. Mobile apps include an interactive Gantt chart that syncs across devices. The apps are available offline, and sync when connection is restored. === Criticism === Critics have said that new users may face a learning curve with the more complex features. Wrike has 2,710 customers for an estimated 0.04% market share. Competitors include Google Workspace, Slack, and Quip.

  • Apache Hama

    Apache Hama is a distributed computing framework based on bulk synchronous parallel computing techniques for massive scientific computations, e.g., matrix, graph and network algorithms. Originally a sub-project of Hadoop, it became an Apache Software Foundation top level project in 2012. It was created by Edward J. Yoon, who named it as a shortening of "Hadoop Matrix Algebra"; Hama also means hippopotamus in Yoon's native Korean language (하마), following the trend of naming Apache projects after animals and zoology (such as Apache Pig). Hama was inspired by Google's Pregel large-scale graph computing framework described in 2010. When executing graph algorithms, Hama showed a fifty-fold performance increase relative to Hadoop. The project was retired in April 2020, and its resources are made available as part of the Apache Attic. Yoon cited issues of installation, scalability, and a difficult programming model for its lack of adoption. == Architecture == Hama consists of three major components: BSPMaster, GroomServers and Zookeeper. === BSPMaster === BSPMaster is responsible for:

      • Maintaining groom server status
      • Controlling super steps in a cluster
      • Maintaining job progress information
      • Scheduling jobs and assigning tasks to groom servers
      • Disseminating execution class across groom servers
      • Controlling fault
      • Providing users with the cluster control interface.

    A BSP Master and multiple grooms are started by the script. Then, the BSP master starts up with an RPC server for groom servers. Groom servers start up with a BSPPeer instance and an RPC proxy to contact the BSP master. Once started, each groom periodically sends a heartbeat message that encloses its groom server status, including maximum task capacity, unused memory, and so on. Each time the BSP master receives a heartbeat message, it brings the groom server status up-to-date.
The bsp master makes use of groom servers' status in order to assign tasks to idle groom servers - and returns a heartbeat response containing assigned tasks and others actions for a groom server to do. Currently BSP master has a FIFO job scheduler and simple task assignment algorithms. === GroomServer === A groom server (shortly referred to as groom) is a process that performs BSP tasks assigned by BSPMaster. Each groom contacts the BSPMaster, and it takes assigned tasks and reports its status by means of periodical piggybacks with BSPMaster. Each groom is designed to run with HDFS or other distributed storages. Basically, a groom server and a data node should be run on one physical node. === Zookeeper === A Zookeeper is used to manage the efficient barrier synchronisation of the BSPPeers.
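The heartbeat exchange described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Hama's actual code or API: the names `GroomStatus`, `Master`, `submit`, and `heartbeat` are hypothetical, and the assignment rule (fill free slots in FIFO order) is only one plausible reading of "simple task assignment".

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GroomStatus:
    """Status a groom reports in each heartbeat (fields are illustrative)."""
    name: str
    max_tasks: int          # maximum task capacity
    running_tasks: int = 0  # tasks currently executing on this groom


@dataclass
class Master:
    """Toy stand-in for a BSP master with a FIFO task queue (hypothetical)."""
    pending: deque = field(default_factory=deque)  # tasks in arrival order
    grooms: dict = field(default_factory=dict)     # groom name -> latest status

    def submit(self, job_id: str, num_tasks: int) -> None:
        # Split a job into tasks; FIFO order is preserved across jobs.
        for i in range(num_tasks):
            self.pending.append(f"{job_id}-task{i}")

    def heartbeat(self, status: GroomStatus) -> list:
        # 1. Bring the stored groom status up to date.
        self.grooms[status.name] = status
        # 2. Hand out as many pending tasks as the groom has free slots.
        free_slots = status.max_tasks - status.running_tasks
        assigned = []
        while free_slots > 0 and self.pending:
            assigned.append(self.pending.popleft())
            free_slots -= 1
        # 3. The heartbeat response carries the assigned tasks.
        return assigned


master = Master()
master.submit("jobA", 3)
master.submit("jobB", 2)

first = master.heartbeat(GroomStatus("groom1", max_tasks=2))
second = master.heartbeat(GroomStatus("groom2", max_tasks=4, running_tasks=1))
print(first)   # tasks handed to groom1
print(second)  # tasks handed to groom2
```

Because the queue is strictly first-in, first-out, every task of `jobA` is handed out before any task of `jobB`, whichever groom happens to report in first.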

    Read more →
  • Hard sigmoid

    Hard sigmoid

    In artificial intelligence, especially computer vision and artificial neural networks, a hard sigmoid is a non-smooth function used in place of a sigmoid function. Hard sigmoids retain the basic shape of a sigmoid, rising from 0 to 1, but use simpler functions, especially piecewise linear functions or piecewise constant functions. They are preferred where speed of computation is more important than precision. == Examples == The most extreme examples are the sign function and the Heaviside step function, which jump at 0 from −1 to 1 and from 0 to 1, respectively (which to use depends on normalization). Other examples include the Theano library, which provides two approximations: ultra_fast_sigmoid, which is a multi-part piecewise approximation, and hard_sigmoid, which is a 3-part piecewise linear approximation (output 0, line with slope 0.2, output 1).
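The 3-part piecewise linear form can be written as a clamped line. The sketch below is a plain-Python illustration rather than Theano's own code; the slope of 0.2 comes from the description above, while the offset of 0.5 is an assumption, chosen so that the function equals 0.5 at the origin as the logistic sigmoid does.

```python
def hard_sigmoid(x: float, slope: float = 0.2, shift: float = 0.5) -> float:
    """Clamp the line slope * x + shift to the interval [0, 1]."""
    return min(1.0, max(0.0, slope * x + shift))


def heaviside(x: float) -> float:
    """Piecewise constant extreme case: 0 for negative input, 1 otherwise."""
    return 0.0 if x < 0 else 1.0


# Far left: flat at 0. Middle: linear. Far right: flat at 1.
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(x, hard_sigmoid(x), heaviside(x))
```

With these parameters the linear segment covers roughly the interval from -2.5 to 2.5, and the output is constant outside it, so evaluation needs only a multiply, an add, and two comparisons.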

    Read more →
  • Systems development life cycle

    Systems development life cycle

    The systems development life cycle (SDLC) describes the typical phases and progression between phases during the development of a computer-based system. These phases progress from inception to retirement. At base, there is just one life cycle, but the taxonomy used to describe it may vary; the cycle may be classified into different numbers of phases and various names may be used for those phases. The SDLC is analogous to the life cycle of a living organism from its birth to its death. In particular, the SDLC varies by system in much the same way that each living organism has a unique path through its life. The SDLC does not prescribe how engineers should go about their work to move the system through its life cycle. Prescriptive techniques are referred to using various terms such as methodology, model, framework, and formal process. Other terms are used for the same concept as SDLC, including software development life cycle (also SDLC), application development life cycle (ADLC), and system design life cycle (also SDLC). These other terms focus on a different scope of development and are associated with different prescriptive techniques, but are about the same essential life cycle. The term "life cycle" is often written without a space, as "lifecycle", with the former more popular in the past and in non-engineering contexts. The acronym SDLC was coined when the longer form was more popular and has remained associated with the expansion, even though the shorter form is popular in engineering. Also, SDLC is relatively unique as opposed to the TLA SDL, which is highly overloaded. == Phases == Depending on the source, the SDLC is described as having different phases and using different terms. Even so, there are common aspects. The following attempts to describe notable phases using notable terminology. The phases are somewhat ordered by the natural sequence of development, although they can be overlapping and iterative. 
=== Conceptualization === During conceptualization (a.k.a. conceptual design, system investigation, feasibility), options and priorities are considered. A feasibility study can determine whether the development effort is worthwhile via activities such as understanding user needs, cost estimation, benefit analysis, and resource analysis. A study should address operational, financial, technical, human factors, and legal/political concerns. === Requirements analysis === Requirements analysis (a.k.a. preliminary design) involves understanding the problem and determining what is needed. Often this involves engaging users to define the requirements and recording them in a document known as a requirements specification. === Design === During the design phase (a.k.a. detail design), a solution is planned. The plan can include relatively high-level information such as describing the major components of the system. The plan can include relatively low-level information such as describing functions, screen layout, business rules, and process flow. The design phase is informed by the requirements of the system. The design must satisfy each requirement. The design may be recorded in textual documents as well as functional hierarchy diagrams, example screen images, business rules, process diagrams, pseudo-code, and data models. === Construction === During construction (a.k.a. implementation, production), the system is realized. Based on the design, hardware and software components are created and integrated. This phase includes testing sub-components, components and the integration of some components, but typically does not include testing at the complete system level. This phase may include the development of training materials, including user manuals and help files. === Acceptance === The acceptance phase (a.k.a. system testing) is about testing the complete system to ensure that it meets customer expectations (requirements). === Deployment === The deployment phase (a.k.a. 
implementation) involves the logistics of delivery to the customer. Some systems are deployed as a single instance (e.g., in the cloud), and deployment may be ad hoc and manual. Some systems are built in quantity and are associated with a manufacturing process and commissioning. This phase may include training users to use the system. It may include transitioning future development to support staff. === Maintenance === During the maintenance phase (a.k.a. operation, utilization, support) development is largely inactive, although this phase does include customer support for resolving user issues and recording suggestions for improvement. Fixes and enhancements are handled by returning to the first phase, conceptualization. For minor changes, the cycle may be significantly abbreviated compared to initial development. === Decommission === Decommission (a.k.a. disposition, retirement, phase-out) is when the system is removed from use, i.e., when it reaches end-of-life. == Practices == === Management and control === SDLC phase objectives are described in this section with key deliverables, a description of recommended tasks, and a summary of related control objectives for effective management. It is critical for the project manager to establish and monitor control objectives while executing projects. Control objectives are clear statements of the desired result or purpose and should be defined and monitored throughout a project. Control objectives can be grouped into major categories (domains), and relate to the SDLC phases as shown in the figure. To manage and control a substantial SDLC initiative, a work breakdown structure (WBS) captures and schedules the work. The WBS and all programmatic material should be kept in the "project description" section of the project notebook. The project manager chooses a WBS format that best describes the project. 
The diagram shows that coverage spans numerous phases of the SDLC, while the associated Management Control Domains (MCD) show the mappings to SDLC phases. For example, Analysis and Design is primarily performed as part of the Acquisition and Implementation Domain, and System Build and Prototype is primarily performed as part of delivery and support. === Work breakdown structured organization === The upper section of the WBS provides an overview of the project scope and timeline. It should also summarize the major phases and milestones. The middle section is based on the SDLC phases. WBS elements consist of milestones and tasks to be completed rather than activities to be undertaken, and each has a deadline. Each task has a measurable output (e.g., an analysis document). A WBS task may rely on one or more activities (e.g., coding). Parts of the project needing support from contractors should have a statement of work (SOW). An SOW is not developed during a specific phase of the SDLC; rather, it is written to cover the work from the SDLC process that may be conducted by contractors. === Baselines === Baselines are established after four of the five phases of the SDLC, and are critical to the iterative nature of the model. Baselines become milestones: the functional baseline, established after the conceptual design phase; the allocated baseline, established after the preliminary design phase; the product baseline, established after the detailed design and development phase; and the updated product baseline, established after the production construction phase. In the following diagram, these stages are divided into ten steps, from definition to creation and modification of IT work products:

    Read more →
  • Test data

    Test data

    Test data are sets of inputs or information used to verify the correctness, performance, and reliability of software systems. Test data encompass various types, such as positive and negative scenarios, edge cases, and realistic user scenarios, and aim to exercise different aspects of the software to uncover bugs and validate its behavior. Test data are also used in regression testing to verify that new code changes or enhancements do not introduce unintended side effects or break existing functionality. == Background == Test data may be used to verify that a given set of inputs to a function produces an expected result. Alternatively, data can be used to challenge the program's ability to handle unusual, extreme, exceptional, or unexpected inputs. Test data can be produced in a focused or systematic manner, as is typically the case in domain testing, or through less focused approaches, such as high-volume randomized automated tests. Test data can be generated by the tester or by a program or function that assists the tester. It can be recorded for reuse or used only once. Test data may be created manually, using data generation tools (often based on randomness), or retrieved from an existing production environment. The data set may consist of synthetic (fake) data, but ideally, it should include representative (real) data. == Limitations == Due to privacy regulations and standards such as GDPR, PCI DSS, and HIPAA, the use of privacy-sensitive personal data for testing is restricted. However, anonymized (and preferably subsetted) production data may be used as representative data for testing and development. Programmers may also choose to generate synthetic data as an alternative to using real or anonymized data. While synthetic data can offer significant advantages, such as enhanced privacy and flexibility, it also comes with limitations. For instance, generating synthetic data that accurately reflects real-world complexity can be challenging. 
There is also a risk of synthetic data not fully capturing the nuances of real data, potentially leading to gaps in test coverage. == Domain testing == Domain testing is a set of techniques focusing on test data. This includes identifying critical inputs, values at the boundaries between equivalence classes, and combinations of inputs that drive the system toward specific outputs. Domain testing helps ensure that various scenarios are effectively tested, including edge cases and unusual conditions.
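As a small illustration of domain testing, the sketch below generates test data at the boundaries between equivalence classes for an integer range. The function under test, `is_valid_age`, and its accepted range of 18 to 65 are hypothetical, chosen only to make the example concrete.

```python
def boundary_values(low: int, high: int) -> list:
    """Return inputs at and just beyond the edges of the valid range [low, high]."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


def is_valid_age(age: int) -> bool:
    """Hypothetical function under test: accepts ages 18 through 65 inclusive."""
    return 18 <= age <= 65


# Test data: six inputs around the two boundaries, with hand-written expectations.
inputs = boundary_values(18, 65)
expected = [False, True, True, True, True, False]
results = [is_valid_age(value) for value in inputs]

for value, want, got in zip(inputs, expected, results):
    print(f"age={value:>2} expected={want} actual={got}")
```

The expectations are written by hand from the specification rather than computed by the code under test, so an off-by-one error in the comparison (for example `<` instead of `<=`) would show up as a mismatch at 18 or 65.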

    Read more →
  • Dropbox Carousel

    Dropbox Carousel

    Dropbox Carousel was a photo and video management app offered by Dropbox. The native app, available on Android and iOS, allowed users to store, manage, and organize photos. Photos were organized by date, time and event and backed up on Dropbox. It competed in this space against other online photo storage services such as Google's Google Photos, Apple's iCloud, and Yahoo's Flickr. Chris Lee, Dropbox's head of product development for Carousel, described the app as an add-on to Dropbox, a “dedicated experience for photos and videos” and a space for “reliving personal memories”. == History == Mailbox founder Gentry Underwood unveiled Carousel at a gathering in San Francisco on April 9, 2014. Many of Carousel's features came from Snapjoy, a photo start-up that Dropbox acquired on December 19, 2012. Carousel's launch followed a series of acquisitions that Dropbox made in preparation for a public stock offering. The acquisitions were meant to demonstrate an expansive product range and to signal potential profitability to investors. In December 2015, Dropbox announced that Carousel would be shut down and some Carousel features would be integrated into the primary Dropbox application. On March 31, 2016, Carousel was deactivated. == Features == Carousel prompted users to free local storage once it had synced and backed up local photos to the cloud. Flashback was a feature (enabled by default) that showed past photos or videos taken on the same day one or more years earlier. Flashback used an algorithm designed to identify human faces, making it more likely that the user or people in the user's close circle would appear. A scrollable timeline at the bottom, which replaced an earlier scroll wheel, let the user jump to photos from a specific date with a finger swipe.

    Read more →
  • Hugging Face

    Hugging Face

    Hugging Face, Inc., is an American company based in New York City that develops computation tools for building applications using machine learning. Its transformers library is built for natural language processing applications, and its platform allows users to share machine learning models and datasets and showcase their work. == History == === Founding === The company was founded in 2016 by French entrepreneurs Clément Delangue, Julien Chaumond, and Thomas Wolf in New York City, originally as a company that developed a chatbot app targeted at teenagers. The company was named after the U+1F917 🤗 HUGGING FACE emoji. After open sourcing the model behind the chatbot, the company pivoted to focus on being a platform for machine learning. === AI boom === On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model. In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large language model with 176 billion parameters. In February 2023, the company announced a partnership with Amazon Web Services (AWS) that would make Hugging Face's products available to AWS customers as building blocks for their custom applications. The company also said the next generation of BLOOM would be run on Trainium, a proprietary machine learning chip created by AWS. In June 2024, the company announced, along with Meta and Scaleway, the launch of a new AI accelerator program for European startups. The initiative aimed to help startups integrate open foundation models into their products, accelerating the EU AI ecosystem. The program, based at STATION F in Paris, ran from September 2024 to February 2025. Selected startups received mentoring and access to AI models, tools, and Scaleway's computing power. 
On September 23, 2024, to further the International Decade of Indigenous Languages, Hugging Face teamed up with Meta and UNESCO to launch a new online language translator. It was built on Meta's No Language Left Behind open-source AI model, enabling free text translation across 200 languages, including many low-resource languages. In April 2025, Hugging Face announced that it had acquired Pollen Robotics, a humanoid robotics startup based in France and founded by Matthieu Lapeyre and Pierre Rouanet in 2016. In a post on X, Delangue shared his vision to "make Artificial Intelligence robotics Open Source". === Cyberattacks === In early 2026, hackers hijacked the Hugging Face platform to launch Android-targeted attacks involving "powerful malware" which could completely take over a compromised target.

    Read more →
  • Hamilton C shell

    Hamilton C shell

    Hamilton C shell is a clone of the Unix C shell and utilities for Microsoft Windows created by Nicole Hamilton at Hamilton Laboratories as a completely original work, not based on any prior code. It was first released on OS/2 on December 12, 1988 and on Windows NT in July 1992. The OS/2 version was discontinued in 2003 but the Windows version continues to be actively supported. == Design == Hamilton C shell differs from the Unix C shell in several respects. These include its compiler architecture, its use of threads, and the decision to follow Windows rather than Unix conventions. === Parser === The original C shell uses an ad hoc parser. This has led to complaints about its limitations. It works well enough for the kinds of things users type interactively but not very well for the more complex commands a user might take time to write in a script. It is not possible, for example, to pipe the output of a foreach statement into grep. There was a limit to how complex a command it could handle. By contrast, Hamilton uses a top-down recursive descent parser that allows it to compile statements to an internal form before running them. As a result, statements can be nested or piped arbitrarily. The language has also been extended with built-in and user-defined procedures, local variables, floating point and additional expression, editing and wildcarding operators, including an "indefinite directory" wildcard construct written as "..." that matches zero or more directory levels as required to make the rest of the pattern match. === Threads === Lacking fork or a high performance way to recreate that functionality, Hamilton uses the Windows threads facilities instead. When a new thread is created, it runs within the same process space and it shares all of the process state. If one thread changes the current directory or the contents of memory, it's changed for all the threads. It's much cheaper to create a thread than a process but there's no isolation between them. 
To recreate the missing isolation of separate processes, the threads cooperate to share resources using locks. === Windows conventions === Hamilton differs from other Unix shells in that it also directly supports Windows conventions for drive letters, filename slashes, escape characters, etc.
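To illustrate why a recursive descent parser makes arbitrary nesting and piping straightforward, here is a minimal Python sketch for a toy grammar with `foreach ... end` blocks and pipes. It is not Hamilton C shell's grammar or implementation: the grammar, the function names, and the tuple-based internal form are all assumptions made for the example, and tokens are assumed to be separated by whitespace.

```python
# Toy grammar (an assumption for illustration only):
#   pipeline  := statement ('|' statement)*
#   statement := 'foreach' NAME '(' WORD* ')' pipeline 'end' | command
#   command   := WORD+

STOP = {"|", ")", "end", None}


def peek(tokens, pos):
    """Return the token at pos, or None when the input is exhausted."""
    return tokens[pos] if pos < len(tokens) else None


def parse(tokens):
    """Compile a token list into a nested tuple tree (the internal form)."""
    tree, pos = parse_pipeline(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def parse_pipeline(tokens, pos):
    node, pos = parse_statement(tokens, pos)
    stages = [node]
    while peek(tokens, pos) == "|":
        node, pos = parse_statement(tokens, pos + 1)
        stages.append(node)
    tree = ("pipe", stages) if len(stages) > 1 else stages[0]
    return tree, pos


def parse_statement(tokens, pos):
    if peek(tokens, pos) == "foreach":
        var = peek(tokens, pos + 1)
        if var is None or peek(tokens, pos + 2) != "(":
            raise SyntaxError("expected: foreach NAME ( ... )")
        pos += 3
        items = []
        while peek(tokens, pos) not in (")", None):
            items.append(tokens[pos])
            pos += 1
        if peek(tokens, pos) != ")":
            raise SyntaxError("missing ')'")
        # The loop body is itself a pipeline, so nesting comes for free.
        body, pos = parse_pipeline(tokens, pos + 1)
        if peek(tokens, pos) != "end":
            raise SyntaxError("missing 'end'")
        return ("foreach", var, items, body), pos + 1
    return parse_command(tokens, pos)


def parse_command(tokens, pos):
    words = []
    while peek(tokens, pos) not in STOP:
        words.append(tokens[pos])
        pos += 1
    if not words:
        raise SyntaxError("expected a command")
    return ("cmd", words), pos


tree = parse("foreach f ( a.c b.c ) echo $f end | grep a".split())
print(tree)
```

Because a `foreach` block is just one kind of statement, it can appear anywhere a simple command can, including as a stage of a pipeline, which is exactly the case that the text says the original C shell cannot handle.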

    Read more →
  • ISPConfig

    ISPConfig

    ISPConfig is an open source hosting control panel for Linux, licensed under the BSD license and developed by the company ISPConfig UG. The ISPConfig project was started in autumn 2005 by Till Brehm from the German company projektfarm GmbH. == Overview == Using the dashboard, administrators can manage websites, email addresses, MySQL and MariaDB as well as PostgreSQL (since version 3.3) databases, FTP accounts, shell accounts and DNS records through a web-based interface. The software has four login levels: administrator, reseller, client, and email-user, each with a different set of permissions. == Operating Systems == ISPConfig is only available on Linux, with CentOS, Debian, and Ubuntu being among the supported distributions. == Features == The following services and features are supported: management of a single server or multiple servers from one control panel; web server management for Apache HTTP Server and Nginx; mail server management (with virtual mail users) with spam and antivirus filtering using Postfix and Dovecot; DNS server management (BIND, PowerDNS); configuration mirroring and clusters; administrator, reseller, client and mail-user login; virtual server management for OpenVZ servers; and website statistics using Webalizer and AWStats.

    Read more →