AI Grammar Sentence Checker

AI Grammar Sentence Checker — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Seam carving

    Seam carving

    Seam carving (or liquid rescaling) is an algorithm for content-aware image resizing, developed by Shai Avidan, of Mitsubishi Electric Research Laboratories (MERL), and Ariel Shamir, of the Interdisciplinary Center and MERL. It functions by establishing a number of seams (paths of least importance) in an image and automatically removes seams to reduce image size or inserts seams to extend it. Seam carving also allows manually defining areas in which pixels may not be modified, and features the ability to remove whole objects from photographs. The purpose of the algorithm is image retargeting, which is the problem of displaying images without distortion on media of various sizes (cell phones, projection screens) using document standards, like HTML, that already support dynamic changes in page layout and text but not images. Image Retargeting was invented by Vidya Setlur, Saeko Takage, Ramesh Raskar, Michael Gleicher and Bruce Gooch in 2005. The work by Setlur et al. won the 10-year impact award in 2015. == Seams == Seams can be either vertical or horizontal. A vertical seam is a path of pixels connected from top to bottom in an image with one pixel in each row. A horizontal seam is similar with the exception of the connection being from left to right. The importance/energy function values a pixel by measuring its contrast with its neighbor pixels. == Process == The below example describes the process of seam carving: The seams to remove depends only on the dimension (height or width) one wants to shrink. It is also possible to invert step 4 so the algorithm enlarges in one dimension by copying a low energy seam and averaging its pixels with its neighbors. === Computing seams === Computing a seam consists of finding a path of minimum energy cost from one end of the image to another. This can be done via Dijkstra's algorithm, dynamic programming, greedy algorithm or graph cuts among others. ==== Dynamic programming ==== Dynamic programming is a programming method that stores the results of sub-calculations in order to simplify calculating a more complex result. Dynamic programming can be used to compute seams. If attempting to compute a vertical seam (path) of lowest energy, for each pixel in a row we compute the energy of the current pixel plus the energy of one of the three possible pixels above it. The images below depict a DP process to compute one optimal seam. Each square represents a pixel, with the top-left value in red representing the energy value of that pixel. The value in black represents the cumulative sum of energies leading up to and including that pixel. The energy calculation is trivially parallelized for simple functions. The calculation of the DP array can also be parallelized with some interprocess communication. However, the problem of making multiple seams at the same time is harder for two reasons: the energy needs to be regenerated for each removal for correctness and simply tracing back multiple seams can form overlaps. Avidan 2007 computes all seams by removing each seam iteratively and storing an "index map" to record all the seams generated. The map holds a "nth seam" number for each pixel on the image, and can be used later for size adjustment. If one ignores both issues however, a greedy approximation for parallel seam carving is possible. To do so, one starts with the minimum-energy pixel at one end, and keep choosing the minimum energy path to the other end. The used pixels are marked so that they are not picked again. Local seams can also be computed for smaller parts of the image in parallel for a good approximation. == Issues == The algorithm may need user-provided information to reduce errors. This can consist of painting the regions which are to be preserved. With human faces it is possible to use face detection. Sometimes the algorithm, by removing a low energy seam, may end up inadvertently creating a seam of higher energy. The solution to this is to simulate a removal of a seam, and then check the energy delta to see if the energy increases (forward energy). If it does, prefer other seams instead. == Implementations == Adobe Systems acquired a non-exclusive license to seam carving technology from MERL, and implemented it as a feature in Photoshop CS4, where it is called Content Aware Scaling. As the license is non-exclusive, other popular computer graphics applications (e. g. GIMP, digiKam, and ImageMagick) as well as some stand-alone programs (e. g. iResizer) also have implementations of this technique, some of which are released as free and open source software. There also exists an implementation for webpages. == Improvements and extensions == Better energy function and application to video by introducing 2D (time+1D) seams. Faster implementation on GPU. Application of this forward energy function to static images. Multi-operator: Combine with cropping and scaling. Much faster removal of multiple seams. Removing seams through neural deformation fields to extend to continuous domains like 3D scenes. A 2010 review of eight image retargeting methods found that seam carving produced output that was ranked among the worst of the tested algorithms. It was, however, a part of one of the highest-ranking algorithms: the multi-operator extension mentioned above (combined with cropping and scaling).

    Read more →
  • Localhost

    Localhost

    In computer networking, localhost is a hostname that refers to the current computer used to access it. The name localhost is reserved for loopback purposes. It is used to access the network services that are running on the host via the loopback network interface. Using the loopback interface bypasses any local network interface hardware. == Loopback == The local loopback mechanism may be used to run a network service on a host without requiring a physical network interface, or without making the service accessible from the networks the computer may be connected to. For example, a locally installed website may be accessed from a Web browser by the URL http://localhost to display its home page. IPv4 network standards reserve the entire address block 127.0.0.0/8 (more than 16 million addresses) for loopback purposes. That means any packet sent to any of those addresses is looped back. The address 127.0.0.1 is the standard address for IPv4 loopback traffic; the rest are not supported by all operating systems. However, they can be used to set up multiple server applications on the host, all listening on the same port number. In the IPv6 addressing architecture there is only a single address assigned for loopback: ::1. The standard precludes the assignment of that address to any physical interface, as well as its use as the source or destination address in any packet sent to remote hosts. == Name resolution == The name localhost normally resolves to the IPv4 loopback address 127.0.0.1, and to the IPv6 loopback address ::1. This resolution is normally configured by the following lines in the operating system's hosts file: 127.0.0.1 localhost ::1 localhost The name may also be resolved by Domain Name System (DNS) servers, but there are special considerations governing the use of this name: An IPv4 or IPv6 address query for the name localhost must always resolve to the respective loopback address. Applications may resolve the name to a loopback address themselves, or pass it to the local name resolver mechanisms. When a name resolver receives an address (A or AAAA) query for localhost, it should return the appropriate loopback addresses, and negative responses for any other requested record types. Queries for localhost should not be sent to caching name servers. To avoid burdening the Domain Name System root servers with traffic, caching name servers should never request name server records for localhost, or forward resolution to authoritative name servers. When authoritative name servers receive queries for 'localhost' in spite of the provisions mentioned above, they should resolve them appropriately. In addition to the mapping of localhost to the loopback addresses (127.0.0.1 and ::1), localhost may also be mapped to other IPv4 (loopback) addresses and it is also possible to assign other, or additional, names to any loopback address. The mapping of localhost to addresses other than the designated loopback address range in the hosts file or in DNS is not guaranteed to have the desired effect, as applications may map the name internally. In the Domain Name System, the name .localhost is reserved as a top-level domain name, originally set aside to avoid confusion with the hostname localhost. Domain name registrars are precluded from delegating domain names in the top-level .localhost domain. == Historical notes == In 1981, the block 127.0.0.0/8 got a 'reserved' status, as not to assign it as a general purpose class A IP network. This block was officially assigned for loopback purposes in 1986. Its purpose as a Special Use IPv4 Address block was confirmed in 1994,, 2002, 2010,, and last in 2013. From the outset, in 1995, the single IPv6 loopback address ::1 was defined. Its purpose and definition was unchanged in 1998,, 2003,, and up to the current definition, in 2006. == Packet processing == The processing of any packet sent to a loopback address, is implemented in the link layer of the TCP/IP stack. Such packets are never passed to any network interface controller (NIC) or hardware device driver and must not appear outside of a computing system, or be routed by any router. This permits software testing and local services, even in the absence of any hardware network interfaces. Looped-back packets are distinguished from any other packets traversing the TCP/IP stack only by the special IP address they were addressed to. Thus, the services that ultimately receive them respond according to the specified destination. For example, an HTTP service could route packets addressed to 127.0.0.99:80 and 127.0.0.100:80 to different Web servers, or to a single server that returns different web pages. To simplify such testing, the hosts file may be configured to provide appropriate names for each address. Packets received on a non-loopback interface with a loopback source or destination address must be dropped. Such packets are sometimes referred to as Martian packets. As with any other bogus packets, they may be malicious and any problems they might cause can be avoided by applying bogon filtering. == Special cases == The releases of the MySQL database differentiate between the use of the hostname localhost and the use of the addresses 127.0.0.1 and ::1. When using localhost as the destination in a client connector interface of an application, the MySQL application programming interface connects to the database using a Unix domain socket, while a TCP connection via the loopback interface requires the direct use of the explicit address. One notable exception to the use of the 127.0.0.0/8 addresses is their use in Multiprotocol Label Switching (MPLS) traceroute error detection, in which their property of not being routable provides a convenient means to avoid delivery of faulty packets to end users.

    Read more →
  • Electronic lab notebook

    Electronic lab notebook

    An electronic lab notebook or electronic laboratory notebook (ELN) is a computer program designed to replace paper laboratory notebooks. Lab notebooks in general are used by scientists, engineers, and technicians to document research, experiments, and procedures performed in a laboratory. A lab notebook is often maintained to be a legal document and may be used in a court of law as evidence. Similar to an inventor's notebook, the lab notebook is also often referred to in patent prosecution and intellectual property litigation. Electronic lab notebooks offer many benefits to the user as well as organizations; they are easier to search upon, simplify data copying and backups, and support collaboration amongst many users. ELNs can have fine-grained access controls, and can be more secure than their paper counterparts. They also allow the direct incorporation of data from instruments, replacing the practice of printing out data to be stapled into a paper notebook. == Types == ELNs can be divided into two categories: "Specific ELNs" contain features designed to work with specific applications, scientific instrumentation or data types. "Cross-disciplinary ELNs" or "Generic ELNs" are designed to support access to all data and information that needs to be recorded in a lab notebook. Lab Platforms that combine an ELN, LIMS, and scientific data management together, all-in-one configurable software environment. Solutions range from specialized programs designed from the ground up for use as an ELN, to modifications or direct use of more general programs. Examples of using more general software as an ELN include using OpenWetWare, a MediaWiki install (running the same software that Wikipedia uses), WordPress, or the use of general note taking software such as OneNote as an ELN. ELN's come in many different forms. They can be standalone programs, use a client-server model, or be entirely web-based. Some use a lab-notebook approach, others resemble a blog. ELNs are embracing artificial intelligence and LLM technology to provide scientific AI chat assistants. A good many variations on the "ELN" acronym have appeared. Differences between systems with different names are often subtle, with considerable functional overlap between them. Examples include "ERN" (Electronic Research Notebook), "ERMS" (Electronic Resource (or Research or Records) Management System (or Software) and SDMS (Scientific Data (or Document) Management System (or Software). Ultimately, these types of systems all strive to do the same thing: Capture, record, centralize and protect scientific data in a way that is highly searchable, historically accurate, and legally stringent, and which also promotes secure collaboration, greater efficiency, reduced mistakes and lowered total research costs. == Objectives == A good electronic laboratory notebook should offer a secure environment to protect the integrity of both data and process, whilst also affording the flexibility to adopt new processes or changes to existing processes without recourse to further software development. The package architecture should be a modular design, so as to offer the benefit of minimizing validation costs of any subsequent changes that you may wish to make in the future as your needs change. A good electronic laboratory notebook should be an "out of the box" solution that, as standard, has fully configurable forms to comply with the requirements of regulated analytical groups through to a sophisticated ELN for inclusion of structures, spectra, chromatograms, pictures, text, etc. where a preconfigured form is less appropriate. All data within the system may be stored in a database (e.g. MySQL, MS-SQL, Oracle) and be fully searchable. The system should enable data to be collected, stored and retrieved through any combination of forms or ELN that best meets the requirements of the user. The application should enable secure forms to be generated that accept laboratory data input via PCs and/or laptops / palmtops, and should be directly linked to electronic devices such as laboratory balances, pH meters, etc. Networked or wireless communications should be accommodated for by the package which will allow data to be interrogated, tabulated, checked, approved, stored and archived to comply with the latest regulatory guidance and legislation. A system should also include a scheduling option for routine procedures such as equipment qualification and study related timelines. It should include configurable qualification requirements to automatically verify that instruments have been cleaned and calibrated within a specified time period, that reagents have been quality-checked and have not expired, and that workers are trained and authorized to use the equipment and perform the procedures. == Regulatory and legal aspects == The laboratory accreditation criteria found in the ISO 17025 standard needs to be considered for the protection and computer backup of electronic records. These criteria can be found specifically in clause 4.13.1.4 of the standard. Electronic lab notebooks used for development or research in regulated industries, such as medical devices or pharmaceuticals, are expected to comply with FDA regulations related to software validation. The purpose of the regulations is to ensure the integrity of the entries in terms of time, authorship, and content. Unlike ELNs for patent protection, FDA is not concerned with patent interference proceedings, but is concerned with avoidance of falsification. Typical provisions related to software validation are included in the medical device regulations at 21 CFR 820 (et seq.) and Title 21 CFR Part 11. Essentially, the requirements are that the software has been designed and implemented to be suitable for its intended purposes. Evidence to show that this is the case is often provided by a Software Requirements Specification (SRS) setting forth the intended uses and the needs that the ELN will meet; one or more testing protocols that, when followed, demonstrate that the ELN meets the requirements of the specification and that the requirements are satisfied under worst-case conditions. Security, audit trails, prevention of unauthorized changes without substantial collusion of otherwise independent personnel (i.e., those having no interest in the content of the ELN such as independent quality unit personnel) and similar tests are fundamental. Finally, one or more reports demonstrating the results of the testing in accordance with the predefined protocols are required prior to release of the ELN software for use. If the reports show that the software failed to satisfy any of the SRS requirements, then corrective and preventive action ("CAPA") must be undertaken and documented. Such CAPA may extend to minor software revisions, or changes in architecture or major revisions. CAPA activities need to be documented as well. Aside from the requirements to follow such steps for regulated industry, such an approach is generally a good practice in terms of development and release of any software to assure its quality and fitness for use. There are standards related to software development and testing that can be applied (see ref.).

    Read more →
  • Key Transparency

    Key Transparency

    Key Transparency allows communicating parties to verify public keys used in end-to-end encryption. In many end-to-end encryption services, to initiate communication a user will reach out to a central server and request the public keys of the user with which they wish to communicate. If the central server is malicious or becomes compromised, a man-in-the-middle attack can be launched through the issuance of incorrect public keys. The communications can then be intercepted and manipulated. Additionally, legal pressure could be applied by surveillance agencies to manipulate public keys and read messages. With Key Transparency, public keys are posted to a public log that can be universally audited. Communicating parties can verify public keys used are accurate.

    Read more →
  • D4Science

    D4Science

    D4Science is a Data Infrastructure offering services by community-driven virtual research environments. In particular, it supports communities of practice willing to implement open science practices, thus it is an Open Science Infrastructure. The infrastructure follows the system of systems approach, where the constituent systems (Service providers) offer "resources" (namely services and by them data, computing, storage) assembled together to implement the overall set of D4Science services. In particular, D4Science aggregates "domain agnostic" service providers as well as community-specific ones to build a unifying space where the aggregated resources can be exploited via Virtual research Environments and their services. It is spread across several sites, the primary one is hosted by the Istituto di Scienza e Tecnologie dell'Informazione of National Research Council (Italy). At the earth of this infrastructure there is an Open Source Software named gCube system. == Services == D4Science offers: Virtual Research Environment as a Service providing any community of practice with a dedicated working environment supporting any knowledge production process in a collaborative way, in fact every VRE enables computer-supported cooperative work by design. D4Science-based VREs are web-based, community-oriented, collaborative, user-friendly, open-science-enabler working environments for scientists and practitioners willing to work together to perform a set of (research) task. From the end-user perspective, each VRE manifests in a unifying web application (and a set of application programming interfaces (APIs)): (a) comprising several applications organised in specific menu items and (b) running in a plain web browser. Every application is providing VRE users with facilities implemented by relying on one or more services provisioned by diverse providers. Among the basic services every VRE is equipped with there are a Social Networking area enabling collaborative and open discussions on any topic and disseminating information of interest for the community, for example, the availability of a research outcome; a Workspace for storing, organizing and sharing any version of a research artifact, including dataset and model implementation; a User Management dashboard for managing membership and roles; a Catalogue Service recording the assets worth being published thus to make it possible for others to be informed and make use of these assets. Science Gateway as a Service providing a community of practice with a dedicated science gateway hosting a selected set of virtual research environments. Data Analytics at scale for data analytics including: a proprietary data analytics platform (DataMiner) to execute analytics tasks either by relying on methods provided by the user or by others. It is endowed with importing and sharing facilities for analytics methods implemented in heterogeneous forms including R, Java, Python, and KNIME. The platform enacts tasks execution by a distributed and hybrid computing infrastructure. Moreover, one of the worth highlighting feature of this platform is its open science-friendliness. All the analytics methods integrated in it are exposed by a standard protocol (the OGC WPS protocol) clients can use to get informed on available methods as well as to start processes, monitor their execution and access results. Every analytics task performed by the platform automatically produces a provenance record catering for the reproducibility of the task; an RStudio-based development environment for R enabling to perform statistical computing tasks in the cloud. This RStudio environment is (i) preconfigured with libraries and packages to ease the execution of common data analytics tasks, and (ii) provides seamless access to the VRE Workspace enabling sharing of resources with other members of the same working environment. a Jupyter-based notebook environment for developing and executing interactive computing by JupyterLab instances. Each JupyterLab is (i) preconfigured with libraries and packages to ease the execution of common data analytics tasks, and (ii) provides access to the VRE Workspace enabling sharing of resources with other members of the same working environment. == Community == The D4Science Infrastructure serves more than 24,000 registered users (August 2024) through 177 active VREs offered via 20 Science gateways. This extensive infrastructure not only supports a diverse range of scientific communities but also fosters significant engagement and collaboration among researchers worldwide. Engagement within the D4Science community is robust, with users benefiting from user-friendly application environments tailored to their specific needs. The platform allows users to securely preserve, access, and share their data from anywhere, fostering a collaborative and inclusive research environment. Additionally, groups of users can create their own virtual environments and customise them with the applications they need, further enhancing the platform's flexibility and usability. Supported communities and cases range from Agri-food to Social Data Science, Earth Science and Marine Science. These diverse applications demonstrate the versatility and broad applicability of the D4Science Infrastructure, making it an invaluable resource for researchers across various scientific domains. == History == The D4Science development has been supported by several European-funded projects. DILIGENT (2004-2007) in the Sixth Framework Programme for Research and Technological Development was the forerunner where a testbed infrastructure built by integrating digital library and grid computing technologies and resources was conceived and developed to serve the needs of communities of practice involved in knowledge development. In the context of the Seventh Framework Programme for research, technological development and demonstration the development of the D4Science initiative. In this period the infrastructure was established and developed to serve communities of practices from domains ranging from Earth Science to Marine Science with worldwide scope In the context of the H2020 research and innovation programme the maturity level of the D4Science infrastructure was high enough to allow a large and very diverse set of communities of practice to benefit from it and its services and further contribute to its development. Moreover, the services offered by the infrastructure have been developed to support open science practices. The operation and improvement of the D4Science infrastructure facilities are still ongoing while its exploitation is progressively growing.

    Read more →
  • Social media reach

    Social media reach

    Social media reach is a media analytics metric that refers to the number of users who have come across a particular content on a particular social media platform. Social media platforms have their own individual ways of tracking, analyzing and reporting the traffic on each of the individual platforms. As these platforms are a main source of communication between companies and their target audiences, by conducting research, companies are able to utilize analytical information, such as the reach of their posts, to better understand the interactions between the users and their content. There are multiple underlying factors that will determine what shows up on a newsfeed or timeline. Algorithms, for example, are a type of factor that can alter the reach of a post due to the way the algorithm is coded, which can affect who sees a post and when. Other examples of factors that can impede the reach can include the time at which posts are made, as well as how frequent the posts are between one another. In comparison, an impression is the total number of circumstances where content has been shown on a social timeline, meanwhile, engagement looks at how people interact with the content that they see on a social platform such as like, share or retweet. == Reach on Facebook == Facebook has their own analytic platform which allows the user to see how other users are interacting with their posts, with the use of multiple metrics. This is not something the average user uses, but rather a tool that is used by pages or public figures. For example, Facebook pages that represent a business often look at the activity their posts have generated. There are three types of reach that can be looked at on the Facebook analytic platform. === Types of reach === ==== Organic Reach ==== This type of reach regards the number of distinct users that have seen a specific post on their feed. Organic reach, in other words is the number of people who have seen the post being analyzed on their Facebook newsfeed. Data gathered from this type of reach can give intel to those doing the analysis, such as the demographics of those who have seen the post. ==== Paid Reach ==== This type of reach regards the number of times that distinct users have come across sponsored posts, ads or content. In other words, paid reach is the number of times Facebook users have seen a post that has been paid for by a company. Data collected can give insight, to advertisers or marketers for example, on the activity based around the reach of their post. ==== Viral Reach ==== This type of reach regards the number of views by distinct users on posts that have been commented on or shared by their friends on Facebook. In other words, viral reach looks at the number of people who have seen a post after a friend of theirs commented or shared the original post, therefore it showed on their timeline. Viral reach can be looked at in terms of a collective number of times that the post has been on individual user's timelines. Data collected from viral reach can be used in multiple ways, for example, it can be used to analyze the type of content that gets shared or commented on and can be further used to compare to other posts. === Engaged users === This refers to the number of individual users who have clicked and interacted with a post on Facebook. == Reach on Twitter == Twitter gives access to any of their users to analytics of their tweets as well as their followers. Their dashboard is user friendly, which allows anyone to take a look at the analytics behind their Twitter account. This open access is useful for both the average user and companies as it can provide a quick glance or general outlook of who has seen their tweets. The way that Twitter works is slightly different than the way of Facebook in terms of the reach. On Twitter, especially for users with a higher profile, they are not only engaging with the people who follow them, but also with the followers of their own followers. The reach metric on Twitter looks at the quantity of Twitter users who have been engaged, but also the number of users that follow them as well. This metric is useful to see the if the tweets/content being shared on Twitter are contributing to the growth of audience on this platform. == Reach on Instagram == Instagram gives their users access to their reach, in the Instagram Insights section. Instagram insights can be used to learn more about an account's followers and performance. Reach indicates the total number of unique Instagram accounts that have seen your Instagram post or story. You can find this data by looking at each individual post insights. == Uses of reach == The reach can be a useful metric to analyze for marketers and advertisers. Social media is a platform that is used by marketers to directly target their intended audience with ease. These platforms not only allow marketers to get a better understanding of their audience, but also allow advertisers to insert their ads onto the timelines of specific users to later be able to conduct research to see the reach of their posts/content. The basic goal of marketers is to increase their reach as much as possible to impact bigger audiences of their dream customers and, in the end, make more sales. When doing organic social media marketing, using paid methods like ads or doing influencer marketing whether it is paid or free, it allows marketers to track the performance of their strategy and tweak it based on what works and what does not. == Analytics and reach == Social analytics looks at the data collected based on the interactions of users on social media platforms. A lot of information can be gathered which can provide intel based on user activities on social media. When looking into analytics in regard to social media, each company or group has a different goal in mind to engage their audience. At a glance, the three might seem as if they are very similar, however the differences between them are significant. There are many aspects that can be analyzed from the data gathered from social media platforms, depending on what is being observed, the correct metric would then be selected to further analyze. One example of the many metrics that can be used through social analytics is the reach. == Reach formula == To calculate social media reach one can use the following formula: R = I f ¯ {\displaystyle R={\frac {I}{\bar {f}}}} where R {\displaystyle R} — is social media reach, I {\displaystyle I} stands for the number of impressions, f ¯ {\displaystyle {\bar {f}}} is the average frequency of impressions per user. f ¯ {\displaystyle {\bar {f}}} represents the number of events when the ad is shown to a particular user. The average value should be calculated over the time period with stable settings of advertisement campaign. == Commenting For Better Reach == Commenting For Better Reach also known as "CFBR" is a widely used strategy for organically boosting post reach on social media platforms. Algorithms tend to favor posts with substantial likes and comments, granting them broader exposure compared to less engaging content. Primarily seen on LinkedIn, a platform geared toward professional networking and business connections, the use of CFBR signals active engagement aimed at enhancing post visibility. It is important to note that genuine and meaningful comments are key to effective engagement. Spammy or irrelevant comments not only detract from the conversation but may also limit a post's potential reach and impact.

    Read more →
  • VK (service)

    VK (service)

    VK (short for its original name VKontakte; Russian: ВКонтакте, lit. 'InContact') is a Russian online social media and social networking service based in Saint Petersburg. VK is available in multiple languages but it is predominantly used by Russian speakers. VK users can message each other publicly or privately, edit messages, create groups, public pages, and events; share and tag images, audio, and video; and play browser-based games. As of August 2018, VK had at least 500 million accounts. As of November 2022, it was the sixth most popular website in Russia. The network was also popular in Ukraine until it was banned by the Verkhovna Rada in 2017. According to Semrush, in 2024, VK was the 30th most visited website in the world; as YouTube is subject to blocking in Russia, VK Video overtook Google's top position in monthly web traffic for the first time in December 2024, as part of the major substitution to domestic business. == History == VKontakte was conceived in 2006 when Pavel Durov, creator of the popular student forum spbgu.ru, met his former classmate Vyacheslav Mirilashvili in St. Petersburg after graduating from the Faculty of Philology at St Petersburg State University. Vyacheslav showed Durov the increasingly popular Facebook, after which the friends decided to create a new Russian social network. Lev Leviev, an Israeli classmate of Vyacheslav Mirilashivili, became the third co-founder. Vyacheslav Mirilashvili borrowed the money from his billionaire father and became the largest shareholder. Lev Leviev took over operational management, and Durov became CEO. Pavel Durov convinced his older brother Nikolai, a multiple winner of international math and programming competitions, to develop the site. Durov launched VKontakte for beta testing in September 2006. The following month, the domain name Vkontakte.ru was registered. The new project was incorporated on 19 January 2007 as a Russian private limited company. In February 2007 the site reached a user base of over 100,000 and was recognized as the second largest company in Russia's nascent social network market. In the same month, the site was subjected to a severe DDoS attack, which briefly put it offline. The user base reached 1 million in July 2007, and 10 million in April 2008. In December 2008 VK overtook rival Odnoklassniki as Russia's most popular social networking service. == Website == Similar to many social networks, the platform's fundamental features revolve around private messaging, sharing photos, posting status updates, and exchanging links with friends. VK also provides tools for administering online communities and managing celebrity pages. The site allows its users to upload, search and stream media content, such as videos and music. VK features an advanced search engine, that allows complex queries for finding friends, as well as a real-time news search. VK updated its features and design in April 2016. === Features === Messaging. VK Private Messages can be exchanged between groups of 2 to 500 people. An email address can also be specified as the recipient. Each message may contain up to 10 attachments: Photos, Videos, Audio Files, Maps (an embedded map with a manually placed marker), and Documents. News. VK users can post on their profile walls, each post may contain up to 10 attachments – media files, maps, and documents (see above). User mentions and hashtags are supported. In the case of multiple photo attachments, the previews are automatically scaled and arranged in a magazine-style layout. The news feed can be switched between all news (default) and most interesting modes. The site features a news-recommendation engine, global real-time search, and individual search for posts and comments on specific users' walls. Communities. VK features three types of communities. Groups are better suited for decentralized communities (discussion boards, wiki-style articles, editable by all members, etc.). Public pages is a news feed-orientated broadcasting tool for celebrities and businesses. The two types are largely interchangeable, the main difference being in the default settings. The third type of community is called Events, which are used for appropriately organizing concerts and events in an appropriate way. Like buttons. VK like buttons for posts, comments, media, and external sites operate differently from Facebook. Liked content doesn't get automatically pushed to the user's wall, but is saved in the private Favorites section instead. The user has to press a second 'share with friends' button to share an item on their wall or send it via private message to a friend. Privacy. Users can control the availability of their content within the network and on the Internet. Blanket and granular privacy settings are available for pages and individual content. Synchronization with other social networks. Any news published on the VK wall will appear on Facebook or Twitter. Certain news may not be published by clicking on the logo next to the "Send" button. Editing a post in VK does not change the post in Facebook or Twitter and vice versa. However, removing the news in VK will remove it from other social networks. SMS service. Russian users can receive and reply to a private message or leave a comment for community news using SMS. Music. Users have access to the audio files uploaded by other users. In addition, users can upload the audio files themselves, create playlists and share audios with others by attaching to messages and wall posts. The uploaded audio files cannot violate copyright laws. === Popularity === As of May 2017, according to Alexa Internet ranking, VK is one of the most visited websites in some Eurasian countries. It is: 4th most visited in Russia; 3rd most visited in Belarus; 6th most visited in Kazakhstan; 8th most visited in Kyrgyzstan and Moldova; 12th most visited in Latvia. It was the fourth most viewed site in Ukraine until, in May 2017, the Ukrainian government banned the use of VK in Ukraine. According to a study for May 2018 conducted by Factum Group Ukraine VK remained the fourth most viewed site in Ukraine, but Facebook was twice as much visited. For 2019, VK appeared as the most visited social network in Ukraine according to Alexa. According to the Internet Association of Ukraine the share of Ukrainian Internet users who visit VK daily had fallen from 54% to 10% from September 2016 to September 2019. They also claimed in November 2019 that Facebook was the most popular social network. VK was expected to gain most of the users lost by Facebook and Instagram after they were blocked in Russia in 2022, according to a Calltouch poll. == Ownership == Initially, founder and CEO Pavel Durov owned 20% of shares (although he had majority voting power through proxy votes), and a trio of Russian-Israeli investors Yitzchak Mirilashvili, his father Mikhael Mirilashvili, and Lev Leviev owned 60%, 10%, and 10% respectively. In 2007, Digital Sky Technologies, an investment company managed by Yuri Milner, acquired a total of 24.99% of the shares from shareholders, investing $16.3 million. In preparation for the IPO in September 2010, DST separated international and Russian assets: the former formed the DST Global fund, while the latter, including VKontakte and rival social network Odnoklassniki, were merged into Mail.ru Group. Mail.ru Group used part of the money to acquire 7.5% of the social network for $112.5 million at a valuation of the entire project of 1.5 billion dollars. After exercising a 7.5% option in July 2011 for $111.7 million, Mail.ru Group accumulated a 39.99% stake in VKontakte. The head of Mail.ru Group, Dmitry Grishin, voiced the company's intention to gain 100% control over VKontakte. MRG was discussing with shareholders to buy out shares from the valuation of the entire company in $2-3 billion. In the summer of 2011, Mirilashvili and Leviev were ready to accept in payment owned by Mail.ru Group shares of Facebook, Groupon, and Zynga, but the deal failed due to Durov's unwillingness to sell a stake on MRG terms. Later, the co-founders considered VKontakte's IPO as an alternative. In March 2012, Durov "accidentally" became plugged into the negotiations where Mirilashvili and Leviev discussed selling their stakes directly to Mail.ru Group's main investor, Alisher Usmanov. On the same day, Durov deleted the pages of the first co-investors, stopped contacting them, and soon announced that VKontakte would postpone its IPO indefinitely. On 29 May 2012, Mail.ru Group announced its decision to yield control of the company to Durov by offering him the voting rights on its shares. Combined with Durov's personal 12% stake, this gave him 52% of the votes. In April 2013, the Mirilashvili family sold its 40% share in VK to United Capital Partners for $1.12 billion, while Lev Leviev sold his 8% share in the same deal, giving United Capital Partners 48% ownership. In January 2014, VK's founder Pavel Durov sold his 12% stake in the company to I

    Read more →
  • Data marketplace

    Data marketplace

    Data marketplace is an online platform for sharing and consuming data in the form of data assets or data products. Part of the data management stack, it aims to bring together data producers and data consumers (including business users and AI) in a single space, with the objective of increasing access to understandable, high-quality data. Included within its Data Marketplaces and Exchange (DME) category by Gartner, data marketplaces can provide data internally within an organization, externally with partners, or as open data. == Concept == Digitization has dramatically increased data volumes within organizations, with IDC predicting that by 2025 the world will contain 175 zettabytes of data. This has created a need to both manage this data and provide access to it to enable business intelligence and data analysis. However, data is often scattered within multiple systems (such as data warehouses and data lakes), and is in formats that are only understandable by technical experts, such as data scientists. According to IDC, 81% of IT leaders cite data silos as a major barrier to digital transformation. This means that data is not freely available to business users or external audiences such as partners or citizens, limiting its value, and holding back AI deployments. Data marketplaces solve this issue, providing seamless, self-service access to high-quality data in an understandable, secure and auditable manner. They break down data silos, reduce friction in data access, and enable a broader range of users, including non-technical profiles, to find, understand, and consume data autonomously. Data assets on the marketplace can be raw data, data visualizations or data products. Data marketplaces combine data management functions such as data governance with the user-friendly experience offered by e-commerce marketplaces in order to increase the usage of data. These include features such as powerful search engines, feedback, ratings, subscriptions and product description sheets. According to Gartner, data marketplaces provide infrastructure, transactional capabilities, and services for both consumers and providers of data assets. == History and timeline == Data marketplaces have evolved since they first emerged in terms of both their scope and usage. === 2000s === With the rise of the internet, data brokers began collecting, aggregating, distributing and selling personal, financial and marketing data to third parties online. Data marketplaces were deployed to monetize this data, making it discoverable and accessible to users, either through subscriptions or one-off purchases. At the same time, regulations, such as the US Open Government Initiative of 2009 and others around the world mandated greater transparency and data sharing with the public. Data sharing portals were created by public and government bodies to make this information available through self-service to all users. === 2010s === Due to the growth of big data and cloud platforms, cloud-based data exchange platforms emerged. These were offered by major infrastructure providers, and included Amazon Web Services (AWS) Data Exchange, Snowflake Data Marketplace, and the Google Cloud Platform. These platforms moved beyond simple data brokerage or open data by providing structured, catalogued data sharing between organizations. === 2020s === Driven by a need to increase internal data sharing with both business users and AI, organizations are now looking to adopt internal data marketplaces. These aim to democratize data consumption by providing seamless access for all employees and AI to trusted data, including data products, through an intuitive, e-commerce style experience. According to Gartner analyst Richa Jha, "by providing a single, governed platform for discovering, sharing, and scaling data products, data marketplaces drive productivity, collaboration, and ROI across the enterprise." == Data marketplaces within the overall data architecture == Data marketplaces provide a consumption and collaboration layer for data. That means they complement and integrate with other parts of the overall data architecture, including: === Data warehouses and data lakes === Data marketplaces connect to data sources, such as data warehouses or data lakes, to provide intuitive access to the data stored within them, enabling data to be shared and distributed to non-technical audiences. Access can be direct, with data and data products stored within the data marketplace or virtualized. === Data catalog === A data catalog provides a technical inventory of an organization's data estate. It collects technical information on all available data assets within an organization, based on metadata descriptions. This ensures traceability, and supports compliance and governance requirements. Unlike a data marketplace, a data catalog does not provide access to data, and is designed to be used by data professionals, rather than the business. This means it lacks an intuitive, understandable interface and is consequently not easily accessible by business users. === Data mesh === Data mesh is an architecture and framework for data management, first defined by Zhamak Dehghani in 2019. It aims to decentralize data ownership to delegate responsibility, empowering teams and focusing on delivering data to users in the form of self-service data products. The data marketplace is a central pillar of data mesh, providing intuitive access to these data products, and creating a collaboration space for data owners and data consumers. === Data product === Data products are high-value, consumable data assets that package high-quality data and associated tools to enable seamless usage by business users at scale. First defined by McKinsey in 2022, they have an identified owner, a service level agreement (SLA), and a reusability logic. == Core components of a data marketplace == A data marketplace typically includes specific core components: === E-commerce style interface === An e-commerce style experience that engages non-technical users, minimizes the need for training and builds confidence and trust in data. Look and feel should be customizable to incorporate corporate design guidelines to ensure consistency with other organizational applications. === Built-in data catalog === As in a standalone data catalog, this indexes all available data, based on metadata that includes type, source, owner, freshness, and quality level. === Discovery and search engine === This enables users to search, filter, explore and discover available data intuitively. As in an e-commerce marketplace, it should be intelligent, and provide relevant results based on natural language queries. === Access control and security management === Data marketplaces will contain data that needs to be protected under regulations such as the General Data Protection Regulation (GDPR) in Europe, the California Consumer Privacy Act (CCPA) in the United States, and sector-specific frameworks in industries such as finance and healthcare. To ensure both security and compliance while maximizing data consumption, the data marketplace should include granular access management and a full audit trail. === Semantic layer and business glossary === Different parts of the business are likely to use different terms to describe data. This leads to inconsistencies and an inability to share data across systems and teams. The semantic layer and business glossary standardize a shared vocabulary and common definitions of business indicators and concepts, providing a single language for data across the business and for AI agents. === Data governance mechanisms === These enforce corporate data governance policies, ensuring data traceability through data lineage, quality certification, usage monitoring, and continuous improvement through user feedback loops. === Collaboration features === As on an e-commerce website, a data marketplace should provide collaboration features that bring together data users and data owners. This includes the ability to rate data products, share use cases, and provide feedback to data owners, creating a community around data and supporting a data-driven culture. == Types of data marketplace == While they share the same underlying technology, data marketplaces can be deployed in three broad ways: === Internal data marketplaces === These bring together data from across an organization and make it available via self-service to employees from across the business. They aim to widen access to data and consequently to improve decision-making and reporting, increase performance and maximize efficiency. === Ecosystem data marketplaces === These extend sharing beyond a single organization, enabling multiple partners (public institutions, industry players, research bodies) to share and consume data within a governed framework. Data can be provided by all parties or simply by one organization and consumed by others. Ecosystem data marketplaces are particularly relevant in

    Read more →
  • Artificial intelligence in fraud detection

    Artificial intelligence in fraud detection

    Artificial intelligence is used by many different businesses and organizations. It is widely used in the financial sector, especially by accounting firms, to help detect fraud. In 2022, PricewaterhouseCoopers reported that fraud has impacted 46% of all businesses in the world. The shift from working in person to working from home has brought increased access to data. According to an FTC (Federal Trade Commission) study from 2022, customers reported fraud of approximately $5.8 billion in 2021, an increase of 70% from the year before. The majority of these scams were imposter scams and online shopping frauds. Furthermore, artificial intelligence plays a crucial role in developing advanced algorithms and machine learning models that enhance fraud detection systems, enabling businesses to stay ahead of evolving fraudulent tactics in an increasingly digital landscape. == Tools == === Expert systems === Expert systems were first designed in the 1970s as an expansion into artificial intelligence technologies. Their design is based on the premise of decreasing potential user error in decision-making and emulating mental reasoning used by experts in a particular field. They differentiate themselves from traditional linear reasoning models by separating identified points in data and processing them individually at the same time. Though, these systems do not rely purely on machine-learned intelligence. Information regarding rules, practices, and procedures in the form of "if-then" statements are implemented into the programming of the system. Users interact with the system by feeding information into the system either through direct entry or import of external data. An inference system compares the information provided by the user with corresponding rules that are believed to specifically apply to the situation. Using this information and the corresponding rules will be used to create a solution to the user's query. Expert systems will generally not operate properly when the common procedures for a specified situation are ambiguous due to the need for well-defined rules. Implementation of expert systems in accounting procedures is feasible in areas where professional judgment is required. Situations where expert systems are applicable include investigations into transactions that involve potential fraudulent entries, instances of going concern, and the evaluation of risk in the planning stages of an audit. === Continuous auditing === Continuous auditing is a set of processes that assess various aspects of information gathered in an audit to classify areas of risk and potential weaknesses in financial Internal controls at a more frequent rate than traditional methods. Instead of analyzing recorded transactions and journal entries periodically, continuous auditing focuses on interpreting the character of these actions more frequently. The frequency of these processes being undertaken as well as highlighting areas of importance is up to the discretion of their implementer, who commonly makes such decisions based on the level of risk in the accounts being evaluated and the goals of implementing the system. Performance of these processes can occur as frequently as being nearly instantaneous with an entry being posted. The processes involved with analyzing financial data in continuous auditing can include the creation of spreadsheets to allow for interactive information gathering, calculation of financial ratios for comparison with previously created models, and detection of errors in entered figures. A primary goal of this practice is to allow for quicker and easier detection of instances of faulty controls, errors, and instances of fraud. === Machine learning and deep learning === The ability of machine learning and deep learning to swiftly and effectively sort through vast volumes of data in the forms of various documents relevant to companies and documents being audited makes them applicable to the domains of audit and fraud detection. Examples of this include recognizing key language in contracts, identifying levels of risk of fraud in transactions, and assessing journal entries for misstatement. == Applications == === 'Big 4' Accounting Firms === Deloitte created an Al-enabled document-reviewing system in 2014. The system automates the method of reviewing and extracting relevant information from different business documents. Deloitte claims that this innovation has made a difference by reducing time spent going through lawful contract documents, invoices, money-related articulations, and board minutes by up to 50%. Working with IBM's Watson, Deloitte is developing cognitive-technology-enhanced commerce arrangements for its clients. LeasePoint is fueled by IBM TRIRIGA (this product evolved into IBM Maximo Real Estate and Facilities) and uses Deloitte's industrial information to create an end-to-end leasing portfolio. Automated Cognitive Resource Assessment employs IBM's Maximo innovation to progress the proficiency of asset inspection. Ernst and Young (EY) connected Al to the investigation of lease contracts. EY (Australia) has also received Al-enabled auditing technology. Collaborating with H20.ai, PwC developed an Al-enabled framework (GL.ai) capable of analyzing reports and preparing reports. PwC claims to have made a significant investment in normal dialect processing (NLP), an Al-enabled innovation to process unstructured information efficiently. KPMG built a portfolio of Al instruments, called KPMG Ignite, to upgrade trade decisions and forms. Working with Microsoft and IBM Watson, KPMG is creating instruments to coordinate Al, data analytics, Cognitive Technologies, and RPA. == Advantages == === Efficiency === The process of auditing an entity in an attempt to detect fraudulent activity requires the repeating of investigatory processes until an error or misstatement may be identified. Under traditional methods, these processes would be carried out by a human being. Proponents of artificial intelligence in fraud detection have stated that these traditional methods are inefficient and can be more quickly accomplished with the aid of an intelligent computing system. A survey of 400 chief executive officers created by KPMG in 2016 found that approximately 58% believed that artificial intelligence would play a key role in making audits more efficient in the future. === Data interpretation === Higher levels of fraud detection entail the use of professional judgement to interpret data. Supporters of artificial intelligence being used in financial audits have claimed that increased risks from instances of higher data interpretation can be minimized through such technologies. One necessary element of an audit of financial statements that requires professional judgement is the implementation of thresholds for materiality. Materiality entails the distinction between errors and transactions in financial statements that would impact decisions made by users of those financial statements. The threshold for materiality in an audit is set by the auditor based on various factors. Artificial intelligence has been used to interpret data and suggest materiality thresholds to be implemented through the use of expert systems. === Decreased costs === Those in favor of using artificial intelligence to complete investigations of fraud have stated that such technologies decrease the amount of time required to complete tasks that are repetitive. The claim further states that such efficiencies allow for lowered resource requirements, which can then be further spent on tasks that have not been fully automated. The audit firm Ernst & Young has posited these claims by declaring that their deep learning systems have been used to reduce time spent on administrative tasks by analyzing relevant audit documents. According to the firm, this has allowed their employees to focus more on judgement and analysis. == Disadvantages == === Job Displacement === The inescapable reception of computer based intelligence and robotization advancements might prompt critical work relocation across different enterprises. As artificial intelligence frameworks become more equipped for performing undertakings customarily completed by people, there is a worry that specific work jobs could become out of date, prompting joblessness and financial imbalance. === Initial investment requirement === Along with a knowledge of coding and building systems through computer programs, we are seeing the advantages of these systems, but since they are so new, they require a large investment to start building such a system. Any firm that is planning on implementing an AI system to detect fraud must hire a team of data scientists, along with upgrading their cloud system and data storage. The system must be consistently monitored and updated to be the most efficient form of itself, otherwise the likelihood of fraud being involved in those transactions increases. If one does not initially invest in such a syst

    Read more →
  • Master/Session

    Master/Session

    In cryptography, Master/Session is a key management scheme in which a pre-shared Key Encrypting Key (called the "Master" key) is used to encrypt a randomly generated and insecurely communicated Working Key (called the "Session" key). The Working Key is then used for encrypting the data to be exchanged. Its advantage is simplicity, but it suffers the disadvantage of having to communicate the pre-shared Key Exchange Key, which can be difficult to update in the event of compromise. The Master/Session technique was created in the days before asymmetric techniques, such as Diffie-Hellman, were invented. This technique still finds widespread use in the financial industry, and is routinely used between corporate parties such as issuers, acquirers, switches. Its use in device communications (such as PIN pads), however, is in decline given the advantages of techniques such as DUKPT.

    Read more →
  • Payment tokenization

    Payment tokenization

    Payment tokenization is a data security process that replaces sensitive payment information, such as credit card numbers, with a unique identifier or "token." This token can be used in place of actual data during transactions but has no exploitable value if breached, thereby reducing the risk of data theft and fraud. == Overview == Payment tokenization is generally categorized into two types: security tokens and payment tokens. Security tokens, also known as post-authorization tokens, are used to replace sensitive information like Primary Account Numbers (PANs), such as credit card numbers either after a payment is authorized or for storing data securely (data-at-rest), such as in merchant databases. These models have been in use since the mid-2000s, following the introduction of the Payment Card Industry Data Security Standard in 2004, which established standards for safeguarding cardholder data. The Payment Card Industry Security Standards Council's 2011 Tokenization Guidelines and the proposed American National Standards Institute X9 standards emphasize using tokens primarily to secure sensitive information, not as replacements for payment credentials processed over financial networks. Traditionally, merchants stored PANs to support backend operations such as settlements, reconciliations, chargebacks, loyalty programs, and customer service. However, with the adoption of security tokenization, merchants can substitute PANs with tokens in their systems. This not only reduces their exposure to fraud but also helps minimize the scope and cost of PCI-DSS compliance, offering a more secure and efficient way to manage cardholder data. == Applications == Payment tokenization is widely used by mobile wallets such as Apple Pay, Google Pay, and Samsung Pay use tokenization to safely store card data on devices. E-commerce platforms rely on it to securely retain customer payment details for recurring purchases. At the physical point of sale, EMV-enabled systems use tokenization to protect card information during in-store transactions. Also, subscription billing services implement tokenization to manage and safeguard payment credentials for ongoing charges.

    Read more →
  • Cover-coding

    Cover-coding

    Cover-coding is a technique for obscuring the data that is transmitted over an insecure link, to reduce the risks of snooping. An example of cover-coding would be for the sender to perform a bitwise XOR (exclusive OR) of the original data with a password or random number which is known to both sender and receiver. The resulting cover-coded data is then transmitted from sender to the receiver, who uncovers the original data by performing a further bitwise XOR (exclusive OR) operation on the received data using the same password or random number. ISO 18000-6C (EPC Class 1 Generation 2) RFID tags protect some operations with a cover code. The reader requests a random number from the tag, and the tag responds with a new random number. The reader then encrypts future communications with this number, using bitwise XOR, to the data it sends. Cover coding is secure if the tag signal can't be intercepted and the random number is not re-used. Compared to the loud transmissions from the reader, tag backscatter is much weaker and difficult -- but not impossible -- to intercept.

    Read more →
  • Recursive transition network

    Recursive transition network

    A recursive transition network ("RTN") is a graph theoretical schematic used to represent the rules of a context-free grammar. RTNs have application to programming languages, natural language and lexical analysis. Any sentence that is constructed according to the rules of an RTN is said to be "well-formed". The structural elements of a well-formed sentence may also be well-formed sentences by themselves, or they may be simpler structures. This is why RTNs are described as recursive. == Notes and references ==

    Read more →
  • Data governance

    Data governance

    Data governance is a term used on both a macro and a micro level. The former is a political concept and forms part of international relations and Internet governance; the latter is a data management concept and forms part of corporate/organizational data governance. Data governance involves delegating authority over data and exercising that authority through decision-making processes. It plays a role in enhancing the value of data assets. == Macro level == Data governance at the macro level involves regulating cross-border data flows among countries, which is more precisely termed international data governance. This field was first formed in the early 2000s, and consists of "norms, principles and rules governing various types of data." There have been several international groups established by research organizations that aim to grant access to their data. These groups that enable an exchange of data are, as a result, exposed to domestic and international legal interpretations that ultimately decide how data is used. However, as of 2023, there are no international laws or agreements specifically focused on data protection. == Data governance (Data Management) == Data governance is the set of principles, policies, and processes that guide the effective and responsible use of data within an organization. It creates a framework for decision making, accountability, and oversight across the data lifecycle, from creation and storage to sharing and disposal. Data governance is closely linked with data management, which provides the practical methods to carry out governance objectives. These methods include data quality assurance, metadata management, master data management, security controls, and compliance monitoring. Together, governance and management aim to maximize the value of data as a strategic asset, reduce risks from misuse or inaccuracy, and ensure compliance with regulatory, ethical, and business requirements. The importance of this discipline has grown with the rise of big data, cloud computing, and artificial intelligence, where consistent standards and stewardship are essential for privacy protection, interoperability, and informed decision making. == Data governance drivers == While data governance initiatives can be driven by a desire to improve data quality, they are often driven by C-level leaders responding to external regulations. In a recent report conducted by the CIO WaterCooler community, 54% stated the key driver was efficiencies in processes; 39% - regulatory requirements; and only 7% customer service. Examples of these regulations include Sarbanes–Oxley Act, Basel I, Basel II, HIPAA, GDPR, cGMP, and a number of data privacy regulations. To achieve compliance with these regulations, business processes and controls require formal management processes to govern the data subject to these regulations. Successful programs identify drivers that are meaningful to both supervisory and executive leadership. Common themes among the external regulations center on the need to manage risk. The risks can be financial misstatement, inadvertent release of sensitive data, or poor data quality for key decisions. Methods to manage these risks vary from industry to industry. Examples of commonly referenced best practices and guidelines include COBIT, ISO/IEC 38500, and others. The proliferation of regulations and standards creates challenges for data governance professionals, particularly when multiple regulations overlap the data being managed. Organizations often launch data governance initiatives to address these challenges. == Data governance initiatives (Dimensions) == Data governance initiatives improve the quality of data by assigning a team responsible for data's accuracy, completeness, consistency, timeliness, validity, and uniqueness. This team usually consists of executive leadership, project management, line-of-business managers, and data stewards. The team usually employs a methodology for tracking and improving enterprise data, such as Six Sigma, and tools for data mapping, profiling, cleansing, and monitoring data. Data governance initiatives may be aimed at achieving a number of objectives including offering better visibility to internal and external customers (such as supply chain management), compliance with regulatory law, improving operations after rapid company growth or corporate mergers, or to aid the efficiency of enterprise knowledge workers by reducing confusion and error and increasing their scope of knowledge. Many data governance initiatives are also inspired by past attempts to fix information quality at the departmental level, which can lead to incongruent and redundant data quality processes. Most large companies have many applications and databases that can not easily share information. Therefore, knowledge workers within large organizations may not have access to the data they need to best do their jobs. When they do have access to the data, the data quality may be poor. By setting up a data governance practice or corporate data authority (individual or area responsible for determining how to proceed, in the best interest of the business, when a data issue arises), these problems can be mitigated. == Implementation == Implementation of a data governance initiative may vary in scope as well as origin. Sometimes, an executive mandate will arise to initiate an enterprise-wide effort. Sometimes the mandate will be to create a pilot project or projects, limited in scope and objectives, aimed at either resolving existing issues or demonstrating value. Sometimes, an initiative originates from lower down in the organization's hierarchy and will be deployed in a limited scope to demonstrate value to potential sponsors higher up in the organization. The initial scope of an implementation can vary greatly as well, from review of a one-off IT system to a cross-organization initiative. == Data governance tools == Leaders of successful data governance programs declared at the Data Governance Conference in Orlando, FL, in December 2006, that data governance is about 80 to 95 percent communication. That stated, it is a given that many of the objectives of a data governance program must be accomplished with appropriate tools. Many vendors are now positioning their products as data governance tools. Due to the different focus areas of various data governance initiatives, a given tool may or may not be appropriate. Additionally, many tools that are not marketed as governance tools address governance needs and demands.

    Read more →
  • Social knowledge management

    Social knowledge management

    Social knowledge management is a business approach that aims to leverage the collective intelligence and social interactions of an organization’s members and stakeholders. It is a branch of knowledge management, which is a multidisciplinary field that deals with the creation, sharing, and use of knowledge in various domains, such as business, economics, psychology, and information management. Knowledge management seeks to enhance organizational performance, innovation, and competitiveness by managing the intangible assets of an organization, such as human capital, know-how, technology, customers, and networks. Social media plays a crucial role in social knowledge management by enhancing communication, collaboration, and learning among individuals and groups, both internally and externally. It offers valuable insights and feedback from customers, partners, and stakeholders, and aids in generating and disseminating new knowledge. In a business context, social media is utilized for various purposes, including sentiment analysis, social learning, social collaboration, and social knowledge management. Social knowledge management is one of the application areas of social media in a business context next to others like sentiment analysis, social learning or social collaboration. Social media use by businesses can strive to achieve the following things from social media strategy point of view: learn, listen, engage in conversation, measure and refine, develop capabilities, define activities, prioritize objectives etc. Social media are not only transforming private communication and interaction, they also will transform how people work. With social media knowledge work in organizations can be optimized extremely: like a better distribution sharing and access to knowledge. This will be more and more important, as in today's business world, speed and complexity increase dramatically, while work environments change constantly. == Examples of Social KM platforms == Elium, a European software application which combines social tagging, bookmarking and networking paradigms to address internal information management purposes. Sciomino was a startup enterprise social network for Social Knowledge Management.

    Read more →