AI For Business Specialization

AI For Business Specialization — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Perceptual robotics

    Perceptual robotics is an interdisciplinary science linking robotics and neuroscience. It investigates biologically motivated robot control strategies, concentrating on perceptual rather than cognitive processes, and thereby sides with J. J. Gibson's view against the poverty of the stimulus theory. As a working definition, the following quote from Chapter 64 of The Springer Handbook of Robotics (edited by Bruno Siciliano and Oussama Khatib, Springer, 2007), written by H. Bülthoff, C. Wallraven and M. Giese, could be used: "In the following we will apply the term Perceptual Robotics to signify the design of robots based on principles that are derived from human perception on all three levels in the sense of Marr. This includes a realization in terms of specific neural circuits as well as the transfer of more abstract biologically-inspired strategies for the solution of relevant computational problems."

  • Brain Imaging Data Structure

    The Brain Imaging Data Structure (BIDS) is a standard for organizing, annotating, and describing data collected during neuroimaging experiments. It is based on a formalized file and directory structure and metadata files (based on JSON and TSV) with controlled vocabulary. This standard has been adopted by a multitude of labs around the world as well as databases such as OpenNeuro, SchizConnect, Developing Human Connectome Project, and FCP-INDI, and is seeing uptake in an increasing number of studies. While originally specified for MRI data, BIDS has been extended to several other imaging modalities such as MEG, EEG, and intracranial EEG (see also BIDS Extension Proposals).

    == History ==

    The project is a community-driven effort. BIDS, originally OBIDS (Open Brain Imaging Data Structure), was initiated during an INCF-sponsored data sharing working group meeting (January 2015) at Stanford University. It was subsequently spearheaded and maintained by Chris Gorgolewski. Since October 2019, the project has been headed by a Steering Group and maintained by a separate team of maintainers, the Maintainers Group, according to a governance document that was approved by the BIDS community in a vote. BIDS has advanced under the direction and effort of contributors, the community of researchers who appreciate the value of standardizing neuroimaging data to facilitate sharing and analysis.

    == BIDS Extension Proposals ==

    BIDS can be extended in a backwards-compatible way and is evolving over time. This is accomplished through BIDS Extension Proposals (BEPs), which are community-driven processes following agreed-upon guidelines. A full list of finalized BEPs and BEPs in progress can be found on the BIDS website
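As a rough illustration of the "formalized file and directory structure and metadata files (based on JSON and TSV)" mentioned at the start of this entry, the sketch below writes a tiny dataset skeleton. It is not taken from the excerpt. The function name `make_minimal_bids`, the subject labels, the dataset name, the version string and the sidecar value are all assumptions chosen for illustration. The file naming follows the commonly used `sub-<label>/anat/sub-<label>_T1w.json` pattern, and no image files are written.

```python
import json
import tempfile
from pathlib import Path


def make_minimal_bids(root, subjects):
    """Write a skeleton dataset: top-level metadata plus one sidecar per subject."""
    root.mkdir(parents=True, exist_ok=True)

    # Top-level JSON description (values here are illustrative assumptions).
    description = {"Name": "Example dataset", "BIDSVersion": "1.8.0"}
    (root / "dataset_description.json").write_text(json.dumps(description, indent=2))

    # Tab-separated participant table with a single column.
    rows = ["participant_id"] + [f"sub-{label}" for label in subjects]
    (root / "participants.tsv").write_text("\n".join(rows) + "\n")

    # One anatomical folder per subject, holding a JSON sidecar only.
    for label in subjects:
        anat = root / f"sub-{label}" / "anat"
        anat.mkdir(parents=True, exist_ok=True)
        sidecar = {"MagneticFieldStrength": 3}  # made-up value
        (anat / f"sub-{label}_T1w.json").write_text(json.dumps(sidecar))


with tempfile.TemporaryDirectory() as tmp:
    dataset_root = Path(tmp) / "example_dataset"
    make_minimal_bids(dataset_root, ["01", "02"])
    created = sorted(
        p.relative_to(dataset_root).as_posix()
        for p in dataset_root.rglob("*")
        if p.is_file()
    )

print(created)
```

A real dataset would also contain the imaging files themselves and should be checked with the community's validation tooling. The sketch only shows how the directory hierarchy and the JSON/TSV files fit together.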

  • Business intelligence

    Business intelligence (BI) consists of strategies, methodologies, and technologies used by enterprises for data analysis and management of business information to inform business strategies and business operations. Common functions of BI technologies include reporting, online analytical processing, analytics, dashboard development, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics, and prescriptive analytics. BI tools can handle large amounts of structured and sometimes unstructured data to help organizations identify, develop, and otherwise create new strategic business opportunities. They aim to allow for the easy interpretation of these big data. Identifying new opportunities and implementing an effective strategy based on insights is assumed to potentially provide businesses with a competitive market advantage and long-term stability, and help them take strategic decisions. Business intelligence can be used by enterprises to support a wide range of business decisions ranging from operational to strategic. Basic operating decisions include product positioning or pricing. Strategic business decisions involve priorities, goals, and directions at the broadest level. In all cases, business intelligence is considered most effective when it combines data from the market in which a company operates (external data) with data from internal company sources, such as financial and operational information. When integrated, external and internal data provide a comprehensive view that creates ‘intelligence’ not possible from any single data source alone. Among their many uses, business intelligence tools empower organizations to gain insight into new markets, to assess demand and suitability of products and services for different market segments, and to gauge the impact of marketing efforts. 
    BI applications use data gathered from a data warehouse (DW) or from a data mart, and the concepts of BI and DW combine as "BI/DW" or as "BIDW". A data warehouse contains a copy of analytical data that facilitates decision support.

    == History ==

    The earliest known use of the term business intelligence is in Richard Millar Devens' Cyclopædia of Commercial and Business Anecdotes (1865). Devens used the term to describe how the banker Sir Henry Furnese gained profit by receiving and acting upon information about his environment, prior to his competitors: "Throughout Holland, Flanders, France, and Germany, he maintained a complete and perfect train of business intelligence. The news of the many battles fought was thus received first by him, and the fall of Namur added to his profits, owing to his early receipt of the news." The ability to collect and react accordingly based on the information retrieved, Devens says, is central to business intelligence.

    When Hans Peter Luhn, a researcher at IBM, used the term business intelligence in an article published in 1958, he employed the Webster's Dictionary definition of intelligence: "the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal."

    In 1989, Howard Dresner (later a Gartner analyst) proposed business intelligence as an umbrella term to describe "concepts and methods to improve business decision making by using fact-based support systems." It was not until the late 1990s that this usage became widespread.

    == Definition ==

    According to Solomon Negash and Paul Gray, business intelligence (BI) can be defined as systems that combine:

      • Data gathering
      • Data storage
      • Knowledge management

    with analysis to evaluate complex corporate and competitive information for presentation to planners and decision makers, with the objective of improving the timeliness and the quality of the input to the decision process.
    According to Forrester Research, business intelligence is "a set of methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information used to enable more effective strategic, tactical, and operational insights and decision-making." Under this definition, business intelligence encompasses information management (data integration, data quality, data warehousing, master-data management, text- and content-analytics, et al.). Therefore, Forrester refers to data preparation and data usage as two separate but closely linked segments of the business-intelligence architectural stack.

    Some elements of business intelligence are:

      • Multidimensional aggregation and allocation
      • Denormalization, tagging, and standardization
      • Realtime reporting with analytical alert
      • A method of interfacing with unstructured data sources
      • Group consolidation, budgeting, and rolling forecasts
      • Statistical inference and probabilistic simulation
      • Key performance indicators optimization
      • Version control and process management
      • Open item management

    Forrester distinguishes this from the business-intelligence market, which is "just the top layers of the BI architectural stack, such as reporting, analytics, and dashboards."

    === Compared with competitive intelligence ===

    Though the term business intelligence is sometimes a synonym for competitive intelligence (because they both support decision making), BI uses technologies, processes, and applications to analyze mostly internal, structured data and business processes while competitive intelligence gathers, analyzes, and disseminates information with a topical focus on company competitors. If understood broadly, competitive intelligence can be considered as a subset of business intelligence.

    === Compared with business analytics ===

    Business intelligence and business analytics are sometimes used interchangeably, but there are alternate definitions.
    Thomas Davenport, professor of information technology and management at Babson College, argues that business intelligence should be divided into querying, reporting, online analytical processing (OLAP), an "alerts" tool, and business analytics. In this definition, business analytics is the subset of BI focusing on statistics, prediction, and optimization, rather than the reporting functionality.

    == Unstructured data ==

    Business operations can generate a very large amount of data in the form of emails, memos, notes from call centers, news, user groups, chats, reports, web pages, presentations, image files, video files, and marketing material. According to Merrill Lynch, more than 85% of all business information exists in these forms; a company might only use such a document a single time. Because of the way it is produced and stored, this information is either unstructured or semi-structured.

    The management of semi-structured data is an unsolved problem in the information technology industry. According to projections from Gartner (2003), white-collar workers spend 30–40% of their time searching, finding, and assessing unstructured data. BI uses both structured and unstructured data. The former is easy to search, and the latter contains a large quantity of the information needed for analysis and decision-making. Because of the difficulty of properly searching, finding, and assessing unstructured or semi-structured data, organizations may not draw upon these vast reservoirs of information, which could influence a particular decision, task, or project. This can ultimately lead to poorly informed decision-making.

    Therefore, when designing a business intelligence/DW solution, the specific problems associated with semi-structured and unstructured data must be accommodated, as well as those associated with structured data.

    === Limitations of semi-structured and unstructured data ===

    There are several challenges to developing BI with semi-structured data.
    According to Inmon & Nesavich, some of those are:

      • Physically accessing unstructured textual data – unstructured data is stored in a huge variety of formats.
      • Terminology – Among researchers and analysts, there is a need to develop standardized terminology.
      • Volume of data – As stated earlier, up to 85% of all data exists as semi-structured data. Couple that with the need for word-to-word and semantic analysis.
      • Searchability of unstructured textual data – A simple search on some data, e.g. apple, results in links where there is a reference to that precise search term. Inmon & Nesavich (2008) give an example: "a search is made on the term felony. In a simple search, the term felony is used, and everywhere there is a reference to felony, a hit to an unstructured document is made. But a simple search is crude. It does not find references to crime, arson, murder, embezzlement, vehicular homicide, and such, even though these crimes are types of felonies".

    === Metadata ===

    To solve problems with searchability and assessment of data, it is necessary to know something about the content. This can be done by adding context through the use of metadata. Many systems already capture some metadata (e.g. filename, author, size, etc.), but more usef

  • Personal network

    A personal network is a set of human contacts known to an individual, with whom that individual would expect to interact at intervals to support a given set of activities. In other words, a personal network is a group of caring, dedicated people who are committed to maintaining a relationship with a person in order to support a given set of activities. Having a strong personal network requires being connected to a network of resources for mutual development and growth.

    Personal networks can be understood by:

      • who knows you
      • what you know about them
      • what they know about you
      • what you are learning together
      • how you work at that

    Personal networks are intended to be mutually beneficial, extending the concept of teamwork beyond the immediate peer group. The term is usually encountered in the workplace, though it could apply equally to other pursuits outside work.

    Personal networking is the practice of developing and maintaining a personal network, which is usually undertaken over an extended period. The concept is related to business networking and is often encouraged by large organizations, in the hope of improving productivity, and so a number of tools exist to support the maintenance of networks. Many of these tools are IT-based, and use Web 2.0 technologies.

    == History of networking and business success ==

    In the second half of the twentieth century, U.S. advocates for workplace equity popularized the term and concept of networking as part of a larger social capital lexicon (which also includes terms such as glass ceiling, role model, mentoring, and gatekeeper) serving to identify and address the problems barring non-dominant groups from professional success. Mainstream business literature subsequently adopted the terms and concepts, promoting them as pathways to success for all career climbers. In 1970 these terms were not in the general American vocabulary; by the mid-1990s they had become part of everyday speech.
Before the mid-twentieth century, what we call networking today was framed in the language of family and friendship. These close personal relationships provided a range of opportunities to preferred subsets of people, such as access to job opportunities, information, credit, and partnerships. Family networks and nepotism have proven particularly strong throughout history. However, other common bonds—from ethnicity and religion to school ties and club memberships—can connect subsets of people as well. Of course people whom insiders consider undesirable have been barred from such networks, with important consequences. Those who tap into influential networks can be nurtured toward success. Those who are shut out from networks can lose hope of success. Numerous business heroes of the past—such as Benjamin Franklin, Andrew Carnegie, Henry Ford, and John D. Rockefeller—exploited networks to great effect. The business networks that seemed natural and transparent to these white men were a closed book to women and minorities for much of American history. Drawing on work from the social sciences, these outsider groups had to identify and then harness the mechanisms behind networking's power. A prominent early example of this process was the formation of corporate caucuses by black men at Xerox starting in 1969. Groups of black salesmen met regularly to share information about Xerox's culture and strategies for navigating it most effectively. Through confrontation and collaboration with a relatively accommodating upper management, the caucuses helped open opportunities for high-performing black employees. The popular and business press began using the terms "network" and "networking" in the mid-1970s in the context of businesswomen consciously pursuing this strategy. Authors encouraged female workers to recognize and exploit the informal workplace systems that provided advancement. 
    They urged women to identify mentors, use social contacts, and build peer and authority networks. The push for networking drew on ideas and relationships from the era's feminist movement, and dictionaries of the time explicitly linked business networking to women's efforts to succeed in the workplace.

    Since the closing decades of the twentieth century, networking has become a pervasive term and concept in American society. People now invoke networking in relation to everything from business to child rearing to science. While ambitious careerists seek networks as an indispensable talisman, companies purposefully encourage networking among their employees to boost performance and gain competitive advantage. At the same time, Americans are forgetting the workplace activism that first illuminated the power of networking. Unfortunately, this loss of historical context can fuel a backlash against outsider groups who still seek to synthesize networks so they can access the same opportunities enjoyed by insiders.

    == Characteristics of networks ==

    Broadly speaking, all networks have the following characteristics:

      • Purpose – A network can be established for learning, mission, business, idea, and family or personal reasons.
      • Structure – A network is a group of interlinked entities that form a cluster. Most social structures tend to be characterized by dense clusters of strong connections.
      • Style – The place, space, pace and style of interaction of the networks give an understanding of the style of the networks.

    Namkee Park, Seungyoon Lee and Jang Hyun Kim examined the relations between personal network characteristics and Facebook use. According to their study, personal networks are investigated through several structural characteristics, which can be categorized into three major dimensions according to the level of analysis:

      • Dyadic tie attributes, which include the characteristics of ego-alter ties such as duration, multiplexity, and proximity. Ego-alter tie attributes represent various dimensions of relationships between the focal person and their close contacts. First, tie duration refers to the length of time since the tie was originally initiated, which indicates the duration of relationships. Second, multiplexity includes a focal individual's degree of involvement in various types of interactions with network members. The third dimension is the physical proximity between ego and alter. Theories of proximity suggest that physical proximity between people affects their interaction and, subsequently, their formation of network ties.
      • The characteristics of alter-alter ties, including personal network density. When moving to ties at the alter-alter level, ego-network density, which refers to the extent to which one's alters are connected with each other, is an important dimension of personal networks. Dense personal network structure indicates close interpersonal contacts among alters, and consequently is considered to promote the sharing of resources. On the other hand, loose connections, or structural holes in ego-networks, have been found to facilitate the flow of information and to provide advantages in searching and obtaining resources (e.g., getting a job).
      • The composition of alter attributes, centered on the heterogeneity of alters in one's personal network. The heterogeneity of alters in one's personal network is associated with access to diverse resources and information. It is expected, thus, that the heterogeneity attributes may enhance the focal actor's social activities.

    Each of these characteristics represents unique aspects of individuals' network relationships.

    == Types of personal networks ==

    Personal networks can be used for two main reasons: social and professional. In 2012, LinkedIn along with TNS conducted a survey of 6,000 social network users to understand the difference between personal social networks and personal professional networks.
    The "Mindset Divide" of users of these networks was compared as follows:

      • Emotions:
          Personal social networks: Nostalgia, fun, distraction.
          Personal professional networks: Achievement, success, aspiration.
      • Use:
          Personal social networks: Users are in a casual mindset, often just passing time. They use social networks to socialize, stay in touch, be entertained and kill time.
          Personal professional networks: In this purposeful mindset, users invest time to improve themselves and their future. These networks are used to maintain professional identity, make useful contacts, search for opportunities and stay in touch.
      • Content:
          Personal professional networks: These provide information about career, brand updates and current affairs.
      • Professional development:
          Personal development networks: These provide access to those who can provide information, knowledge, advice, support, expertise, guidance, and concrete resources to learn and work effectively, and thus to those who support continuing professional development.

    == Personal network management ==

    Personal network management (PNM) is a crucial aspect of personal information management and can be understood as the practice of managing the links and connections for social and profession

  • Phase correlation

    Phase correlation is an approach to estimate the relative translative offset between two similar images (digital image correlation) or other data sets. It is commonly used in image registration and relies on a frequency-domain representation of the data, usually calculated by fast Fourier transforms. The term is applied particularly to a subset of cross-correlation techniques that isolate the phase information from the Fourier-space representation of the cross-correlogram.

    == Example ==

    Consider two images corrupted by independent Gaussian noise, where one image is translated by (20, 23) pixels relative to the other. Phase correlation determines this relative translative movement: the phase-correlation representation shows a clear peak at approximately (20, 23).

    == Method ==

    Given two input images $g_a$ and $g_b$:

    Apply a window function (e.g., a Hamming window) to both images to reduce edge effects (this may be optional depending on the image characteristics). Then, calculate the discrete 2D Fourier transform of both images:

    $$\mathbf{G}_a = \mathcal{F}\{g_a\}, \quad \mathbf{G}_b = \mathcal{F}\{g_b\}$$

    Calculate the cross-power spectrum by taking the complex conjugate of the second result, multiplying the Fourier transforms together elementwise, and normalizing this product elementwise:

    $$R = \frac{\mathbf{G}_a \circ \mathbf{G}_b^{*}}{|\mathbf{G}_a \circ \mathbf{G}_b^{*}|}$$

    where $\circ$ is the Hadamard product (entry-wise product) and the absolute values are taken entry-wise as well.
    Written out entry-wise for element index $(j,k)$:

    $$R_{jk} = \frac{G_{a,jk} \cdot G_{b,jk}^{*}}{|G_{a,jk} \cdot G_{b,jk}^{*}|}$$

    Obtain the normalized cross-correlation by applying the inverse Fourier transform:

    $$r = \mathcal{F}^{-1}\{R\}$$

    Determine the location of the peak in $r$:

    $$(\Delta x, \Delta y) = \arg\max_{(x,y)} \{r\}$$

    === Subpixel registration ===

    Commonly, interpolation methods are used to estimate the peak location in the cross-correlogram to non-integer values, despite the fact that the data are discrete, and this procedure is often termed 'subpixel registration'. A large variety of subpixel interpolation methods are given in the technical literature. Common peak interpolation methods such as parabolic interpolation have been used, and the OpenCV computer vision package uses a centroid-based method, though these generally have inferior accuracy compared to more sophisticated methods.

    Because the Fourier representation of the data has already been computed, it is especially convenient to use the Fourier shift theorem with real-valued (sub-integer) shifts for this purpose, which essentially interpolates using the sinusoidal basis functions of the Fourier transform. An especially popular FT-based estimator is given by Foroosh et al. In this method, the subpixel peak location is approximated by a simple formula involving the peak pixel value and the values of its nearest neighbors, where $r_{(0,0)}$ is the peak value and $r_{(1,0)}$ is the nearest neighbor in the x direction (assuming, as in most approaches, that the integer shift has already been found and the comparand images differ only by a subpixel shift).
    $$\Delta x = \frac{r_{(1,0)}}{r_{(1,0)} \pm r_{(0,0)}}$$

    The Foroosh et al. method is quite fast compared to most methods, though it is not always the most accurate. Some methods shift the peak in Fourier space and apply non-linear optimization to maximize the correlogram peak, but these tend to be very slow since they must apply an inverse Fourier transform or its equivalent in the objective function.

    It is also possible to infer the peak location from phase characteristics in Fourier space without the inverse transformation, as noted by Stone. These methods usually use a linear least squares (LLS) fit of the phase angles to a planar model. The long latency of the phase angle computation in these methods is a disadvantage, but the speed can sometimes be comparable to the Foroosh et al. method depending on the image size. They often compare favorably in speed to the multiple iterations of extremely slow objective functions in iterative non-linear methods.

    Since all subpixel shift computation methods are fundamentally interpolative, the performance of a particular method depends on how well the underlying data conform to the assumptions in the interpolator. This fact also may limit the usefulness of high numerical accuracy in an algorithm, since the uncertainty due to interpolation method choice may be larger than any numerical or approximation error in the particular method.

    Subpixel methods are also particularly sensitive to noise in the images, and the utility of a particular algorithm is distinguished not only by its speed and accuracy but also by its resilience to the particular types of noise in the application.

    == Rationale ==

    The method is based on the Fourier shift theorem.
    Let the two images $g_a$ and $g_b$ be circularly-shifted versions of each other:

    $$g_b(x,y) \ \stackrel{\mathrm{def}}{=} \ g_a\big((x-\Delta x) \bmod M,\ (y-\Delta y) \bmod N\big)$$

    (where the images are $M \times N$ in size).

    Then, the discrete Fourier transforms of the images will be shifted relatively in phase:

    $$\mathbf{G}_b(u,v) = \mathbf{G}_a(u,v)\, e^{-2\pi i\left(\frac{u\Delta x}{M} + \frac{v\Delta y}{N}\right)}$$

    One can then calculate the normalized cross-power spectrum to factor out the phase difference:

    $$\begin{aligned}
    R(u,v) &= \frac{\mathbf{G}_a \mathbf{G}_b^{*}}{|\mathbf{G}_a \mathbf{G}_b^{*}|} \\
           &= \frac{\mathbf{G}_a \mathbf{G}_a^{*}\, e^{2\pi i\left(\frac{u\Delta x}{M} + \frac{v\Delta y}{N}\right)}}{\left|\mathbf{G}_a \mathbf{G}_a^{*}\, e^{2\pi i\left(\frac{u\Delta x}{M} + \frac{v\Delta y}{N}\right)}\right|} \\
           &= \frac{\mathbf{G}_a \mathbf{G}_a^{*}\, e^{2\pi i\left(\frac{u\Delta x}{M} + \frac{v\Delta y}{N}\right)}}{|\mathbf{G}_a \mathbf{G}_a^{*}|} \\
           &= e^{2\pi i\left(\frac{u\Delta x}{M} + \frac{v\Delta y}{N}\right)}
    \end{aligned}$$

    since the magnitude of an imaginary exponential is always one, and the phase of $\mathbf{G}_a \mathbf{G}_a^{*}$ is always zero.

    The inverse Fourier transform of a complex exponential is a Dirac delta function, i.e. a single peak:

    $$r(x,y) = \delta(x + \Delta x,\ y + \Delta y)$$

    This result could have been obtained by calculating the cross correlation directly.
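The steps from the Method section and the derivation above can be checked numerically. The following is a minimal sketch using NumPy's FFT routines, assuming an exact circular shift and no windowing. The function names `phase_correlation` and `estimate_shift`, the `eps` guard against division by zero, and the random test image are choices made for this illustration, not part of the source.

```python
import numpy as np


def phase_correlation(g_a, g_b, eps=1e-12):
    """Return the correlation surface r = F^-1{R} for two same-sized images."""
    G_a = np.fft.fft2(g_a)
    G_b = np.fft.fft2(g_b)
    cross = G_a * np.conj(G_b)                   # elementwise G_a * conj(G_b)
    R = cross / np.maximum(np.abs(cross), eps)   # keep phase only; eps avoids 0/0
    return np.fft.ifft2(R).real


def estimate_shift(g_a, g_b):
    """Estimate (dx, dy) with g_b(x, y) = g_a((x - dx) mod M, (y - dy) mod N)."""
    r = phase_correlation(g_a, g_b)
    n_rows, n_cols = r.shape
    peak_row, peak_col = np.unravel_index(np.argmax(r), r.shape)
    # r(x, y) = delta(x + dx, y + dy): the peak sits at (-dx mod M, -dy mod N),
    # so negate the peak index (modulo the image size) to recover the shift.
    dx = int(-peak_col % n_cols)
    dy = int(-peak_row % n_rows)
    return dx, dy


rng = np.random.default_rng(0)
g_a = rng.random((64, 96))                        # N = 64 rows, M = 96 columns
g_b = np.roll(g_a, shift=(23, 20), axis=(0, 1))   # rows by dy = 23, columns by dx = 20

shift = estimate_shift(g_a, g_b)
print(shift)  # (20, 23)
```

Shifts are reported in the range [0, M) and [0, N), so a shift to the left or upward appears wrapped around the image size. With real, non-circular shifts and noise the peak is lower and broader than in this idealized case.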
    The advantage of this method is that the discrete Fourier transform and its inverse can be performed using the fast Fourier transform, which is much faster than correlation for large images.

    === Benefits ===

    Unlike many spatial-domain algorithms, the phase correlation method is resilient to noise, occlusions, and other defects typical of medical or satellite images.

    The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.

    === Limitations ===

    In practice, it is more likely that $g_b$ will be a simple linear shift of $g_a$, rather than a circular shift as required by the explanation above. In such cases, $r$ will not be a simple delta function, which will reduce the performance of the method. In such cases, a window function (such as a Gaussian or Tukey window) should be employed during the Fourier transform to reduce edge effects, or the images should be zero padded so that the edge effects can be ignored. If the images consist of a flat background, with all detail situated away from the edges, then a linear shift will be equivalent to a circular shift, and the above derivation will hold exactly. The peak can be sharpened by using edge or vector correlation.

    For periodic images (such as a chessboard or picket fence), phase correlation may yield ambiguous results with several peaks in the resulting output.

    == Applications ==

    Phase correlation is the preferred m

  • Cloud Data Management Interface

    ISO/IEC 17826 Information technology — Cloud Data Management Interface (CDMI) Version 2.0.0 is an international standard that specifies a protocol for self-provisioning, administering and managing access to data stored in cloud storage, object storage, storage area network and network attached storage systems. The CDMI standard is developed and maintained by the Storage Networking Industry Association, which makes a publicly accessible version of the specification available.

    CDMI defines new resource representations to enable standardized management of any URI-accessible data, and defines RESTful HTTP operations using these representations to discover the capabilities of the storage system, discover stored data, access and update management metadata, specify data storage protocols (such as iSCSI and NFS) through which the stored data is accessed, and provide cross-system and cross-cloud import and export in order to enable data portability.

    Management functions enabled by CDMI include managing data ownership, identity mapping, access controls, and user-specified metadata, and declaratively specifying desired data protection, data retention, constraints on geographic placement, desired quality of service, data versioning and security requirements. CDMI also defines utility services to facilitate data management, such as the ability to query data matching specific criteria, and includes extensions to perform bulk updates using CDMI Jobs.

    == Capabilities ==

    Compliant implementations must provide access to a set of configuration parameters known as capabilities. These are either boolean values that represent whether or not a system supports things such as queues, export via other protocols, path-based storage and so on, or numeric values expressing system limits, such as how much metadata may be placed on an object.
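A client-side guard over such capabilities might look like the sketch below. It assumes the capabilities have already been retrieved and parsed into a plain dictionary. The key names (`queues`, `export_nfs`, `metadata_max_size`) and the helper functions are hypothetical placeholders for illustration, not the names used by the CDMI specification.

```python
def supports(capabilities, name):
    """True only if the system advertises the boolean capability `name`."""
    return capabilities.get(name) is True


def within_limit(capabilities, name, requested):
    """True if `requested` does not exceed the numeric limit `name`.

    A missing entry is treated as "no limit advertised"; that is an
    assumption of this sketch, not a rule taken from the standard.
    """
    limit = capabilities.get(name)
    return limit is None or requested <= limit


# Hypothetical capability set, as a client might hold it after discovery.
capabilities = {"queues": False, "export_nfs": True, "metadata_max_size": 4096}

can_use_queues = supports(capabilities, "queues")
can_export_nfs = supports(capabilities, "export_nfs")
small_metadata_ok = within_limit(capabilities, "metadata_max_size", 1024)
large_metadata_ok = within_limit(capabilities, "metadata_max_size", 10_000)

print(can_use_queues, can_export_nfs, small_metadata_ok, large_metadata_ok)
```

Treating an absent boolean as "not supported" keeps the client conservative: it only uses a feature the system has explicitly advertised.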
As a minimal compliant implementation can be quite small, with few features, clients need to check the cloud storage system for a capability before attempting to use the functionality it represents. Resource allocation assignments limited to the data management interface protocols must possess access bypass capabilities which extend beyond the layered framework. This integral function is vital to the prevention of transport layer session hijacking by unauthorized entities which may circumvent standard interfacing security parameters. == Containers == A CDMI client may access objects, including containers, by either name or object id (OID), assuming the CDMI server supports both methods. When storing objects by name, it is natural to use nested named containers; the resulting structure corresponds exactly to a traditional filesystem directory structure. == Objects == Objects are similar to files in a traditional file system, but are enhanced with an increased amount and capacity for metadata. As with containers, they may be accessed by either name or OID. When accessed by name, clients use URLs that contain the full pathname of objects to create, read, update and delete them. When accessed by OID, the URL specifies an OID string in the cdmi-objectid container; this container presents a flat name space conformant with standard object storage system semantics. Subject to system limits, objects may be of any size or type and have arbitrary user-supplied metadata attached to them. Systems that support query allow arbitrary queries to be run against the metadata. == Domains, Users and Groups == CDMI supports the concept of a domain, similar in concept to a domain in the Windows Active Directory model. Users and groups created in a domain share a common administrative database and are known to each other on a "first name" basis, i.e. without reference to any other domain or system. Domains also function as containers for usage and billing summary data. 
== Access Control == CDMI exactly follows the ACL and ACE model used for file authorization operations by NFSv4. This makes it also compatible with Microsoft Windows systems. == Metadata == CDMI draws much of its metadata model from the XAM specification. Objects and containers have "storage system metadata", "data system metadata" and arbitrary user specified metadata, in addition to the metadata maintained by an ordinary filesystem (atime etc.). == Queries == CDMI specifies a way for systems to support arbitrary queries against CDMI containers, with a rich set of comparison operators, including support for regular expressions. == Queues == CDMI supports the concept of persistent FIFO (first-in, first-out) queues. These are useful for job scheduling, order processing and other tasks in which lists of things must be processed in order. == Compliance == Both retention intervals and retention holds are supported by CDMI. A retention interval consists of a start time and a retention period. During this time interval, objects are preserved as immutable and may not be deleted. A retention hold is usually placed on an object because of judicial action and has the same effect: objects may not be changed nor deleted until all holds placed on them are removed. == Billing == Summary information suitable for billing clients for on-demand services can be obtained by authorized users from systems that support it. == Serialization == Serialization of objects and containers allows export of all data and metadata on a system and importation of that data into another cloud system. == Foreign protocols == CDMI supports export of containers as NFS or CIFS shares. Clients that mount these shares see the container hierarchy as an ordinary filesystem directory hierarchy, and the objects in the containers as normal files. Metadata outside of ordinary filesystem metadata may or may not be exposed. Provisioning of iSCSI LUNs is also supported. 
== Client SDKs == CDMI Reference Implementation, Droplet, libcdmi-java, libcdmi-python, .NET SDK
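The retention rules described under Compliance can be sketched as a small model. This is an illustration only, not CDMI's wire format: the `StoredObject` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass
class StoredObject:
    # Hypothetical model of an object's retention state.
    retention_start: Optional[datetime] = None
    retention_period: Optional[timedelta] = None
    holds: Set[str] = field(default_factory=set)

    def is_mutable(self, now: datetime) -> bool:
        """An object may be changed or deleted only outside its retention
        interval and only when no retention holds remain on it."""
        if self.holds:
            return False
        if self.retention_start is not None and self.retention_period is not None:
            end = self.retention_start + self.retention_period
            if self.retention_start <= now < end:
                return False
        return True
```

For example, an object with a 30-day retention period starting on 1 January is immutable on 15 January, becomes mutable again in March, and is frozen once more as soon as a hold is added to `holds`.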

    Read more →
  • Eduroam

    Eduroam

    eduroam (a portmanteau of education and roaming) is an international Wi-Fi internet access roaming service for users in research, higher education and further education. It provides researchers, teachers, and students network access when visiting an institution other than their own. Users are authenticated with credentials from their home institution, regardless of the location of the eduroam access point. Authorization to access the Internet and other resources is handled by the visited institution. Users do not have to pay to use eduroam. In some countries, Internet access via eduroam is also available at locations other than the participating institutions, e.g. in libraries, public buildings, railway stations, city centres and airports. It is also available at many primary and secondary education institutions in Brazil and the US. == History == The eduroam initiative started in 2002 when, during the preparations for the creation of TERENA's task force TF-Mobility, Klaas Wierenga of SURFnet shared the idea of combining a RADIUS-based infrastructure with IEEE 802.1X technology to provide roaming network access across research and education networks. Initially, the service was joined by institutions in the Netherlands, Germany, Finland, Portugal, Croatia and the United Kingdom. Later, other NRENs in Europe embraced the idea and started joining the infrastructure, which was then called eduroam. Since 2004, the European Union co-funded further research and development work related to the eduroam service through the GN2 and GN3 projects. From September 2007, the European Union also funded through these projects the continued operation and maintenance of the eduroam service at the European level. The first non-European country to join eduroam was Australia, in December 2004. In Canada, eduroam started as an initiative of the University of British Columbia, which was later taken over by CANARIE as a service of its Canadian Access Federation. 
In the United States, eduroam was initially a pilot project between the National Science Foundation and the University of Tennessee (UTK). In 2012, Internet2 announced the addition of eduroam to its NET+ service offerings. AnyRoam LLC, a private company, was formed by former UTK staff to serve as an Internet2 active corporate member administering the US top-level servers. In 2021, Internet2 assumed direct management of the eduroam service for US-based organizations. == Technology == The eduroam service uses IEEE 802.1X as the authentication method and a hierarchical system of RADIUS servers. The hierarchy typically consists of RADIUS servers at the participating institutions, national RADIUS servers run by the National Roaming Operators, and regional top-level RADIUS servers for individual world regions. In some cases, institutions contact each other directly via DNS lookups. When a user visits a remote institution, the user's device presents their credentials to the local RADIUS server. That RADIUS server discovers that it is not responsible for the realm of the user's home institution and proxies the access request to another RADIUS server, typically the national RADIUS server. If the visited institution is in a different country than the home institution, the request is in turn proxied to the regional top-level RADIUS server, and then to the national RADIUS server of the user's home country. That national server forwards the credentials to the home institution, where they are verified. The RADIUS response travels back over the proxy hierarchy to the visited institution and the user is granted access. In eduroam, the user credentials are always presented in the form of an EAP method. The EAP method is responsible for ensuring that the user's credentials are secure and private. The user's credentials can then travel via a number of intermediate servers that are not under the control of the user's home institution. 
This requirement limits the types of EAP methods that can be used. EAP methods which do not provide for security or privacy of user credentials cannot be used in eduroam. The most commonly used EAP methods in eduroam are EAP-TLS, PEAP, and EAP-TTLS. The methods used generally fall into two broad categories: those that use credentials in the form of some public-key mechanism with certificates and those that use so-called tunnelled authentication with "inner" passwords or other credentials. Most institutions use a tunnelled authentication method that requires a server certificate. These server certificates are used to set up a secure tunnel between the mobile device and the authentication server, through which the user credentials (e.g. name and password) are securely transported. A complication arises if the user's home institution does not use a two-letter country-code top-level domain as part of its realm, but a generic top-level domain such as .edu or .org. By inspection of such realms, it is not possible to determine which national RADIUS server the request should be routed to. Such domains will thus, by default, fail to work in international roaming. The workaround for this problem involves the creation of exceptions in the international RADIUS request routing tables; however, this workaround does not scale as the number of exception entries grows. Several solutions have been proposed to eliminate this workaround in the future, the most promising of which is RADIUS over TLS with Dynamic Discovery, which does not rely on static routing tables inside a RADIUS server configuration to route requests to their proper destination. Instead, the participating institution adds one NAPTR DNS resource record to its own domain's DNS zone, which states by which server eduroam authentication for the domain is handled. == Governance == GÉANT has established a lightweight global governance structure. 
Recognising the large variety in the organisation and funding of research and education (networking) in different countries and regions, rules imposed on the operations of eduroam are limited to technical and administrative requirements that are necessary to ensure the smooth and secure operations of eduroam worldwide. Moreover, the eduroam operators have the leading role in creating and maintaining the rules of the global eduroam governance. The Global eduroam Governance Committee (GeGC) has the central role in the global eduroam governance structure. While its structure has evolved over time, it presently has three representatives from each of five regions — mirroring those used by the Regional Internet registries — serving a two-year term. In addition, GÉANT may appoint one or more experts as non-voting members of the GeGC. == Geographical deployment == eduroam is available at selected locations in countries with a National Roaming Operator that has signed the eduroam Compliance Statement. Those sixty-seven countries are listed below. In addition, there may be pilot deployments in countries that are in the process of joining eduroam. === Middle East === eduroam is deployed in: === Europe === The NRENs that are members of the consortium of the GN3 project have joined the European eduroam confederation by signing the confederation's policy that requires its members to comply with a set of technical and organisational requirements, which are more specific than those in the global eduroam Compliance Statement. As a consequence, eduroam is deployed in the following countries: In addition, three NRENs that are associate members of the consortium of the GN3 project without voting rights joined the European eduroam confederation; they represent Belarus (UIIP), Moldova (RENAM) and Russia (Joint Supercomputer Center of the Russian Academy of Sciences). 
Finally, five NRENs not involved in the GN3 project joined the European eduroam confederation on a voluntary basis, enabling the deployment of the service in: The European top-level RADIUS servers are operated by SURFnet and Forskningsnettet. === Asia-Pacific === eduroam is deployed in the following countries and economies: The Asia-Pacific top-level RADIUS servers are operated by AARNet and by the University of Hong Kong. === North America === eduroam is deployed in: === Latin America === eduroam is deployed in: === Africa === eduroam is deployed in: The inter-African RADIUS servers are operated by West-African research and education network WACREN, the UbuntuNet Alliance and TENET.
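The realm-based proxying described under Technology can be sketched as a routing decision. The following is a simplified illustration, not eduroam's actual server configuration: the function name, the hop labels and the `exceptions` table are hypothetical, realms are matched exactly, and a two-letter top-level domain is treated as the home country.

```python
def route(user_id, local_realm, local_country, exceptions=None):
    """Return the list of RADIUS hops an access request would take.

    exceptions -- hypothetical static table mapping generic-TLD realms
                  (such as .edu or .org) to a country code
    """
    exceptions = exceptions or {}
    realm = user_id.rsplit("@", 1)[1].lower()
    if realm == local_realm:
        # The visited institution is also the home institution.
        return ["local"]
    tld = realm.rsplit(".", 1)[-1]
    home_country = exceptions.get(realm, tld if len(tld) == 2 else None)
    if home_country is None:
        # A generic TLD does not reveal which national server to use.
        raise LookupError(f"no route for generic-TLD realm {realm!r}")
    hops = ["local", f"national:{local_country}"]
    if home_country != local_country:
        hops += ["regional-top-level", f"national:{home_country}"]
    hops.append(f"home:{realm}")
    return hops
```

The `exceptions` argument mirrors the static-table workaround for generic top-level domains. Every such realm needs its own entry, which is why that approach does not scale and why DNS-based dynamic discovery has been proposed instead.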

    Read more →
  • Social network hosting service

    Social network hosting service

    A social network hosting service is a web hosting service that specifically hosts the user creation of web-based social networking services, alongside related applications. Such services are also known as vertical social networks due to the creation of SNSes which cater to specific user interests and niches; like larger, interest-agnostic SNSes, such niche networking services may also possess the ability to create increasingly niche groups of users. == List of social network hosting services == Federated Media Publishing's BigTent BroadVision Clearvale Ning Wall.fm

    Read more →
  • Anna Becker

    Anna Becker

    Anna Becker is an Israeli researcher known in the field of artificial intelligence and computer science within the financial field. == Early life and education == Becker was born in Russia and immigrated to Israel at 16 after graduating from a school in Moscow. At 17, she began her studies at Technion – Israel Institute of Technology. During her master's degree in computer science, she taught first-year students of the same course, and at 27, Becker completed her PhD in Computer Science and Artificial Intelligence. == Career == While pursuing her PhD, Becker resolved an NP-complete approximation algorithm that had been unresolved for over twenty years. This made her a recognized scholar in the field. After completing her PhD, she developed an approximation technique by a factor of two. This technique is widely used today in operating systems, database systems, and VLSI chip designs. She then founded and sold Strategy Runner, a fintech software. After this, she founded EndoTech, an algorithmic trading platform based on artificial intelligence and machine learning. EndoTech's trading strategies have been operating in live cryptocurrency markets since 2017. The platform's BTC Alpha strategy has reported an average annual return of 163% on fixed capital over eight years of live operation, with a maximum drawdown of 14% and a trade accuracy rate of approximately 83%. In 2026, EndoTech entered a partnership with Bit1 Exchange to make its BTC Alpha and ETH Alpha copy trading strategies accessible to retail investors with no minimum deposit requirement, through a full-custody model in which user funds remain in their own exchange wallets at all times. As of 2023, Becker is working on Fianchetto Fund, an AI-based investing analysis platform. Becker has also co-authored a book on Bayesian networks, which has been published widely in the field of computer science and artificial intelligence.

    Read more →
  • Social media as a public utility

    Social media as a public utility

    Social media as a public utility is a theory postulating that social networking sites (such as those owned by Meta, i.e. Facebook and Instagram, or by Alphabet, i.e. YouTube and Google, as well as independent sites such as Twitter, Tumblr and Snapchat) are essential public services that should be regulated by the government, in a manner similar to how electric and phone utilities are typically government regulated. It is based on the notion that social media platforms have monopoly power and broad social influence. == Background == === Definitions === Social media is defined as "a group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of User Generated Content." Furthermore, the New Zealand Government of Internal Affairs describes it as "a set of online technologies, sites, and practices which are used to share opinions, experiences and perspectives. Fundamentally it is about the conversation. In contrast with traditional media, the nature of social media is to be highly interactive." Moreover, the term social media is described as online tools that let people interact and communicate with each other. This has become a standard word for online cultural exchange and a dominant way for individuals to engage on the internet. By using social media, individuals become more closely and strongly connected than ever before. The traditional definition of the term public utility is "an infrastructural necessity for the general public where the supply conditions are such that the public may not be provided with a reasonable service at reasonable prices because of monopoly in the area." Conventional public utilities include water, natural gas, and electricity. In order to secure the interests of the public, utilities are regulated. Public utilities can also be seen as natural monopolies, implying that the highest degree of efficiency is accomplished under one operator in the marketplace. 
Public utility regulation for social media has been largely criticized because people believe it would produce undesirable and indirect effects. However, others say that truly effective government regulation would produce valuable results. Social media as a public utility is a crucial debate because utilities get regulated, so marking social media websites as utilities would require government regulation of various social media websites and platforms such as Facebook, Google, and Twitter. Applying the term public utility to social media implies that social media websites are public necessities, and, consequently, should be regulated by the government. While social media are not as essential for survival as traditional public utilities such as electricity, water, and natural gas, many people believe it has become vital for living in an interconnected world and without it, living a successful life would be difficult. Therefore, many people believe that social media has reached utility status and should be treated as a public utility. However, others believe that this is not true because social media are constantly revolutionizing and giving such platforms "utility status" would result in government regulation, which would consequently hinder innovation. Over the past decade many have debated and questioned whether or not "Internet service providers should be considered essential facilities or natural monopolies and regulated as public utilities." === Monopoly === A monopoly is defined as "a firm that is the only seller of a product or service having no close substitutes." A natural monopoly is when the entire demand within a relevant market can be satisfied at lowest cost by one firm rather than by two or more, and if such a market contains more than one firm then the firms will "quickly shake down to one through mergers or failures, or production will continue to consume more resources than necessary." 
In a monopoly, competition is said to be short-lived, and in a natural monopoly it is said to produce inefficient results. Public utility companies can be regulated to prevent them from gaining monopolistic control. In November 2011 AT&T's proposal for merging with T-Mobile was rejected because it would have "diminished competition," and have led to the company having monopolistic power within the telephone industry. Such regulation is permitted because the telephone industry is a public utility. Similarly, Microsoft has also been prevented from taking various business actions that could result in the company gaining monopolistic power. If social media were a public utility then regulation of Google and Facebook would similarly dictate what they could and could not do. The possibility was raised in 2018 by U.S. Representative Steve King during a House Judiciary hearing on social media filtering practices. == Arguments == Advocates of this theory believe that social media websites already act like public utilities, and therefore regulation is needed. Additionally, advocates say that in the 21st century, using such websites is as necessary for communication as using traditional public utilities such as telephone, water, electricity, and natural gas is for other everyday uses. Specifically, advocates note that Google search should be treated as a public utility and needs to be regulated because it dominates the search engine market and no website can afford to ignore it. There is the position that a social media website such as Google "is a common carrier and should be regulated as such (Newman 2011)." These positions are reinforced by a perception that social media companies fail to properly maintain fair platforms for discourse. === Individual level === Advocates of regulating social media as a public utility believe that having an Internet presence using social media websites is imperative for individuals to adequately take part in the 21st century. 
Consequently, they argue that these sites are public utilities that need to be regulated to ensure that the constitutional rights of users are protected. For example, regulation may be needed to protect freedom of speech against risks such as Internet censorship and deplatforming. Social media affects people's behavior. For instance, it plays an important role in shaping its users' decisions and actions pertaining to health. This is demonstrated in a Pew Research Center research, which showed that 72 percent of American adults turned to social media for health information in 2011. Around 70 percent of people with chronic illnesses also use the platform to find cure, diagnoses, and other health answers. This development becomes a public issue as social media are likely to provide wrong medical information. Additionally, social media sites can also facilitate deleterious health behavior such as smoking, drug use, and harmful sexual behavior. === Business level === Advocates of social media as a public utility maintain that social media services dominate the Internet and are mainly owned by three or four companies that have unparalleled power to shape user interaction, and because of this power such businesses need to be regulated as public utilities. Zeynep Tufekci, University of North Carolina Chapel Hill, claims that services on the Internet such as Google, eBay, Facebook, Amazon.com, are all natural monopolies. She has stated that these services "benefit greatly from network externalities[,] which means that the more people on the service, the more useful it is for everyone," and thus it is difficult to replace the market leader. === Government level === Advocates of social media as a public utility believe that the government should impose restrictions on social media websites, such as Google, that are designed to benefit its rivals. 
Due to the recent substantial growth of social media websites such as Google, advocates claim that such a website "might need search neutrality regulation modeled after net neutrality regulation and that a Federal Search Commission might be needed to enforce such a regime." danah boyd expresses a future issue which the government may have to deal with in her research: Facebook is becoming an international social media website, specifically prevalent in Canada and Europe which are "two regions that love to regulate their utilities." Furthermore, recent books by New America Foundation Senior Fellow Rebecca MacKinnon and law professor Lori Andrews advise society to start considering Facebook and Google as nation-states or the "sovereigns of cyberspace." Overall, advocates of social media as a public utility believe that due to the immense popularity and necessity of social media websites, it is imperative that the Government imposes regulations in the same manner they do for electricity, water, and natural gas. == Counterarguments == Opponents of this theory say that social media websites should not be treated as public utilities because these platforms are changing every year, and because they are not essential services for s

    Read more →
  • Data security

    Data security

    Data security or data protection is the process of securing digital information to protect it from online threats. Data security or protection means protecting digital data, such as those in a database, from destructive forces and from the unwanted actions of unauthorized users, such as a cyberattack or a data breach. Data security protects computer hardware, software, storage devices, and the data of user devices. Data security also protects the data of organizations, companies and administrative controls. Data security is intended to protect individual data, such as identity documents and bank data, against unauthorized access, theft and loss. Data security also protects against data breaches that occur in companies and industries. Good security measures in industries reduce the probability of data breaches, so that employees can rely on the company to keep their data and private information secure, while companies maintain a stable reputation. The CIA triad (confidentiality, integrity, and availability) describes the three goals of data security. Confidentiality protects information from being accessed by unauthorized persons; integrity ensures that data is trustworthy; and availability means that data can be accessed by approved users when it is needed. Non-repudiation, in the context of data security, is a service that provides proof of where data originated and of its integrity. == Technologies == === Disk encryption === Disk encryption refers to encryption technology that encrypts data on a hard disk drive. It takes data from a storage device and converts it into an unreadable format. Disk encryption typically takes the form of either software (see disk encryption software) or hardware (see disk encryption hardware), which can be used together. 
Disk encryption is often referred to as on-the-fly encryption (OTFE) or transparent encryption. Full disk encryption encrypts each individual sector of a disk volume. Files and user data are encrypted to hinder unauthorized users from accessing without a decryption key. A diversifier permits a plaintext of a specific disk sector to be encrypted into different ciphertexts, which does not require additional storage, such as an initialization vector (IV) or message authentication code (MAC). === Software versus hardware-based mechanisms for protecting data === Software-based security solutions encrypt the data to protect it from theft. However, a malicious program or a hacker could corrupt the data to make it unrecoverable, making the system unusable. Hardware-based security solutions prevent read and write access to data, which provides very strong protection against tampering and unauthorized access. Hardware-based security or assisted computer security offers an alternative to software-only computer security. Security tokens such as those using PKCS#11 or a mobile phone may be more secure due to the physical access required in order to be compromised. Access is enabled only when the token is connected and the correct PIN is entered (see two-factor authentication). However, dongles can be used by anyone who can gain physical access to it. Newer technologies in hardware-based security solve this problem by offering full proof of security for data. Working off hardware-based security: A hardware device allows a user to log in, log out and set different levels through manual actions. Many devices use biometric technology to prevent malicious users from logging in, logging out, and changing privilege levels. The current state of a user of the device is read by controllers in peripheral devices such as hard disks. 
Illegal access by a malicious user or a malicious program is interrupted by hard disk and DVD controllers based on the current state of the user, making illegal access to data impossible. Hardware-based access control is more secure than the protection provided by operating systems, as operating systems are vulnerable to malicious attacks by viruses and hackers. The data on hard disks can be corrupted after malicious access is obtained. With hardware-based protection, the software cannot manipulate the user privilege levels. A hacker or a malicious program cannot gain access to secure data protected by hardware or perform unauthorized privileged operations. This assumption is broken only if the hardware itself is malicious or contains a backdoor. The hardware protects the operating system image and file system privileges from being tampered with. Therefore, a completely secure system can be created using a combination of hardware-based security and secure system administration policies. === Backups === Backup is the process of making copies of essential data and storing them in a separate, secured place. It ensures that lost data can be recovered from another source. A backup contains at least one copy of the data that requires preservation. Keeping a backup of any data is considered essential in most industries, and the process is recommended for any files of importance to a user. There are three types of backups: full backups, incremental backups, and differential backups. Full backups secure all data from a production system, such as a server, database, or other connected data source. Because a full backup holds a complete copy, all data can be restored from it if a breach or corruption occurs. Full backups are time-consuming and may take hours to days to complete. Incremental backups secure only the data changed since the last backup. 
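The difference between the three backup types can be illustrated by which files each one selects. The sketch below assumes the common definitions: an incremental backup copies what changed since the most recent backup of any kind, and a differential backup copies what changed since the last full backup. The function and its parameters are hypothetical, and modification times are plain numbers for simplicity.

```python
def select_files(mtimes, kind, last_full, last_backup):
    """Return the sorted names of files a backup of the given kind would copy.

    mtimes      -- mapping of file name to last-modification time
    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        return sorted(mtimes)
    if kind == "incremental":
        cutoff = last_backup
    elif kind == "differential":
        cutoff = last_full
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    # Copy only files modified after the relevant reference point.
    return sorted(name for name, t in mtimes.items() if t > cutoff)


files = {"a.db": 5, "b.log": 12, "c.txt": 18}
full = select_files(files, "full", last_full=10, last_backup=15)
incremental = select_files(files, "incremental", last_full=10, last_backup=15)
differential = select_files(files, "differential", last_full=10, last_backup=15)
```

With a full backup at time 10 and an incremental one at time 15, the next incremental backup copies only `c.txt`, while a differential backup copies both `b.log` and `c.txt`.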
Whereas a full backup copies all data, an incremental backup saves only data that has recently changed. Incremental backups incur lower storage costs, making them a popular solution for growing datasets. === Data Privacy === Data privacy (or information privacy) is the right of individuals to have their data secured against unauthorized access. It gives individuals control over their data and how it can be shared with third parties. The U.S. Privacy Protection Law (see Privacy laws of the United States) requires organizations to inform individuals of how their data is collected and when a data breach occurs. Implementing encryption ensures that private data is unreadable to cybercriminals. === Data masking === Data masking of structured data is the process of obscuring (masking) specific data within a database table or cell to ensure that data security is maintained and sensitive information is not exposed to unauthorized personnel. This may include masking the data from users (for example so banking customer representatives can only see the last four digits of a customer's national identity number), developers (who need real production data to test new software releases but should not be able to see sensitive financial data), outsourcing vendors, etc. Data masking is a form of encryption, as it obscures data by modifying particular letters and numbers to keep data concealed and protected from potential hackers. Only individuals with access to the code that decrypts the replaced characters can uncover the data. === Data erasure === Data erasure (or data deletion, data destruction) is a method of software-based overwriting that permanently clears all electronic data residing on a hard drive or other digital media to ensure that no sensitive data is lost when an asset is retired or reused. 
Article 17: Right to be Forgotten states that users have the right to permanently remove all of their private information from their old devices/services to give people more control over their data. Users are able to switch between devices efficiently. == Threats == === Malware === Malware (or malicious software) is designed to destroy, corrupt or gain unauthorized access to a computer for the purpose of stealing or destroying data. Hackers who use malware typically combine many types, including computer viruses, computer worms, ransomware, spyware and Trojan horses, to cause widespread disruption and make data theft easy. Victims include healthcare workers, whose systems are compromised by infections and whose data is then attacked. === Phishing === Phishing is a type of scam in which hackers use psychological and social engineering tactics (exploiting human emotions such as trust and fear) to trick people into giving up personal data through emails and messages, or into installing computer viruses by unknowingly clicking a malicious link. Attackers are able to create websites that are very similar to original websites, which makes it difficult to detect a fake website and leads individuals to give up information. Phishing attackers exploit human emotion, for example by making victims feel fear, urgency, sympathy with the message

    Read more →
  • Instagram face

    Instagram face

    Instagram face is a beauty standard based on the filters and influencers popular on Instagram. == Overview == An "Instagram face" has catlike eyes, long lashes, a small nose, high cheekbones, full lips, and a blank expression. Digital filters manipulate photographs and video to create an idealized image that, according to critics, has resulted in an unrealistic and homogeneous beauty standard. According to Jia Tolentino, the face is "distinctly white but ambiguously ethnic". The face has been described as a racial composite of different peoples. In 2024, cosmetic surgeon Paul Banwell said, "People used to come to see me asking to look like a particular celebrity, but many patients come to me now wanting to look like the filtered version of themselves." While based on digital filters, the look is achieved in person using heavy applications of makeup or cosmetic surgery. Plastic surgery, Botox injections, and injectable filler have significantly increased in popularity since the rise of digital filters. Influencers market makeup products designed to recreate the look. == History == The growth of reality television series and social media throughout the 2010s has influenced the popularity of Instagram face. In 2019, The New Yorker referred to this phenomenon as "Instagram Face," identifying Kim Kardashian as its "patient zero." Similarly, her younger sister Kylie Jenner significantly impacted the trend with her 2015 lip filler confession, which acted as a catalyst, introducing Juvéderm to a new generation. Sirin Kale of Vice News has described Jenner as "at the vanguard of an aesthetic that’s swept through British towns and cities," while also pointing towards other celebrities such as Iggy Azalea and Farrah Abraham. In 2018, Americans underwent 7 million neurotoxin injections and 2.5 million filler injections and spent $16.5 billion on cosmetic surgery. 92% of the latter was performed on women. Botox usage has also been on the rise. 
== Criticism == In her 2021 book The Selfie, Temporality, and Contemporary Photography, Claire Raymond of Princeton University criticised "Instagram faces" for erasing "heritable quirks and lived history; it erases what makes the human face so compelling, whether conventionally beautiful or not," while also arguing that the procedures used to create Instagram faces "numb and freeze the face and skin, rendering less mobile the lips, the eyes, and the neck. Numbness is the central feature of the experience for the woman who gets Instagram face through cosmetic procedures. Others may see her more, but she feels less and less." == Influence on popular culture == The increasing popularity of cosmetic surgeries towards a homogeneous ideal has resulted in the emergence of the "goopcore" sub-genre of body horror. The sub-genre combines graphic violence with body modifications from the beauty industry. Allie Rowbottom's goopcore novel Aesthetica centers around an influencer attempting to undo years of plastic surgery with a new experimental procedure.

    Read more →
  • Symbolic regression

    Symbolic regression

    Symbolic regression (SR) is a type of regression analysis that searches the space of mathematical expressions to find the model that best fits a given dataset, both in terms of accuracy and simplicity. No particular model is provided as a starting point for symbolic regression. Instead, initial expressions are formed by randomly combining mathematical building blocks such as mathematical operators, analytic functions, constants, and state variables. Usually, a subset of these primitives will be specified by the person operating it, but that's not a requirement of the technique. The symbolic regression problem for mathematical functions has been tackled with a variety of methods, including recombining equations most commonly using genetic programming, as well as more recent methods utilizing Bayesian methods and neural networks. Another non-classical alternative method to SR is called Universal Functions Originator (UFO), which has a different mechanism, search-space, and building strategy. Further methods such as Exact Learning attempt to transform the fitting problem into a moments problem in a natural function space, usually built around generalizations of the Meijer-G function. By not requiring a priori specification of a model, symbolic regression isn't affected by human bias, or unknown gaps in domain knowledge. It attempts to uncover the intrinsic relationships of the dataset, by letting the patterns in the data itself reveal the appropriate models, rather than imposing a model structure that is deemed mathematically tractable from a human perspective. The fitness function that drives the evolution of the models takes into account not only error metrics (to ensure the models accurately predict the data), but also special complexity measures, thus ensuring that the resulting models reveal the data's underlying structure in a way that's understandable from a human perspective. 
This facilitates reasoning and favors the odds of getting insights about the data-generating system, as well as improving generalisability and extrapolation behaviour by preventing overfitting. Accuracy and simplicity may be left as two separate objectives of the regression—in which case the optimum solutions form a Pareto front—or they may be combined into a single objective by means of a model selection principle such as minimum description length. It has been proven that symbolic regression is an NP-hard problem. Nevertheless, if the sought-for equation is not too complex it is possible to solve the symbolic regression problem exactly by generating every possible function (built from some predefined set of operators) and evaluating them on the dataset in question. == Difference from classical regression == While conventional regression techniques seek to optimize the parameters for a pre-specified model structure, symbolic regression avoids imposing prior assumptions, and instead infers the model from the data. In other words, it attempts to discover both model structures and model parameters. This approach has the disadvantage of having a much larger space to search, because not only the search space in symbolic regression is infinite, but there are an infinite number of models which will perfectly fit a finite data set (provided that the model complexity isn't artificially limited). This means that it will possibly take a symbolic regression algorithm longer to find an appropriate model and parametrization, than traditional regression techniques. This can be attenuated by limiting the set of building blocks provided to the algorithm, based on existing knowledge of the system that produced the data; but in the end, using symbolic regression is a decision that has to be balanced with how much is known about the underlying system. 
Nevertheless, this characteristic of symbolic regression also has advantages: because the evolutionary algorithm requires diversity in order to effectively explore the search space, the result is likely to be a selection of high-scoring models (and their corresponding sets of parameters). Examining this collection could provide better insight into the underlying process, and allows the user to identify an approximation that better fits their needs in terms of accuracy and simplicity. == Benchmarking == === SRBench === In 2021, SRBench was proposed as a large benchmark for symbolic regression. At its inception, SRBench featured 14 symbolic regression methods, 7 other ML methods, and 252 datasets from PMLB. The benchmark is intended to be a living project: it encourages the submission of improvements, new datasets, and new methods, to keep track of the state of the art in SR. === SRBench Competition 2022 === In 2022, SRBench announced the competition Interpretable Symbolic Regression for Data Science, which was held at the GECCO conference in Boston, MA. The competition pitted nine leading symbolic regression algorithms against each other on a novel set of data problems and considered different evaluation criteria. The competition was organized in two tracks, a synthetic track and a real-world data track. ==== Synthetic Track ==== In the synthetic track, methods were compared according to five properties: re-discovery of exact expressions; feature selection; resistance to local optima; extrapolation; and sensitivity to noise. The ranking of the methods was: (1) QLattice; (2) PySR (Python Symbolic Regression); (3) uDSR (Deep Symbolic Optimization). ==== Real-world Track ==== In the real-world track, methods were trained to build interpretable predictive models for 14-day forecast counts of COVID-19 cases, hospitalizations, and deaths in New York State. These models were reviewed by a subject expert, assigned trust ratings, and evaluated for accuracy and simplicity. 
The ranking of the methods was: (1) uDSR (Deep Symbolic Optimization); (2) QLattice; (3) geneticengine (Genetic Engine). == Non-standard methods == Most symbolic regression algorithms prevent combinatorial explosion by implementing evolutionary algorithms that iteratively improve the best-fit expression over many generations. Recently, researchers have proposed algorithms utilizing other tactics in AI. Silviu-Marian Udrescu and Max Tegmark developed the "AI Feynman" algorithm, which attempts symbolic regression by training a neural network to represent the mystery function, then runs tests against the neural network to attempt to break up the problem into smaller parts. For example, if $f(x_1, \ldots, x_i, x_{i+1}, \ldots, x_n) = g(x_1, \ldots, x_i) + h(x_{i+1}, \ldots, x_n)$, tests against the neural network can recognize the separation and proceed to solve for $g$ and $h$ separately and with different variables as inputs. This is an example of divide and conquer, which reduces the size of the problem to be more manageable. AI Feynman also transforms the inputs and outputs of the mystery function in order to produce a new function which can be solved with other techniques, and performs dimensional analysis to reduce the number of independent variables involved. The algorithm was able to "discover" 100 equations from The Feynman Lectures on Physics, while a leading software using evolutionary algorithms, Eureqa, solved only 71. AI Feynman, in contrast to classic symbolic regression methods, requires a very large dataset in order to first train the neural network and is naturally biased towards equations that are common in elementary physics.
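
The exhaustive approach mentioned earlier (generating every expression from a predefined set of operators and evaluating each one on the dataset) can be illustrated with a small sketch. It is not taken from any of the tools named above: the operator set, the leaf symbols, the depth limit, and the names `expressions`, `mse`, and `best_fit` are all assumptions chosen for illustration. Accuracy and simplicity are combined in the simplest possible way, by preferring the smallest expression among those with the lowest error.

```python
import itertools

# Hypothetical building blocks; a real system would let the user choose these.
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
LEAVES = ["x", "1", "2"]


def expressions(depth):
    """Yield (text, function, size) for every expression tree up to `depth`."""
    if depth == 0:
        for leaf in LEAVES:
            if leaf == "x":
                yield leaf, (lambda x: x), 1
            else:
                const = float(leaf)
                yield leaf, (lambda x, c=const: c), 1
        return
    smaller = list(expressions(depth - 1))
    yield from smaller
    # Combine every pair of smaller expressions with every binary operator.
    for (lt, lf, ls), (rt, rf, rs) in itertools.product(smaller, repeat=2):
        for symbol, op in BINARY_OPS.items():
            combined = lambda x, lf=lf, rf=rf, op=op: op(lf(x), rf(x))
            yield f"({lt} {symbol} {rt})", combined, ls + rs + 1


def mse(f, xs, ys):
    """Mean squared error of f on the dataset."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def best_fit(xs, ys, depth=2):
    """Return (error, size, text): lowest error first, then smallest size."""
    return min((mse(f, xs, ys), size, text)
               for text, f, size in expressions(depth))


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [x * x + 1 for x in xs]  # data generated by the "unknown" model
    print(best_fit(xs, ys))
```

Even with three leaves and three operators, depth 2 already yields a few thousand candidate expressions, which illustrates why exhaustive search is only practical when the sought-for equation is not too complex.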

    Read more →
  • Multistage interconnection networks

    Multistage interconnection networks

    Multistage interconnection networks (MINs) are a class of high-speed computer networks usually composed of processing elements (PEs) on one end of the network and memory elements (MEs) on the other end, connected by switching elements (SEs). The switching elements themselves are usually connected to each other in stages, hence the name. MINs are typically used in high-performance or parallel computing as a low-latency interconnection (as opposed to traditional packet switching networks), though they could be implemented on top of a packet switching network. Though the network is typically used for routing purposes, it could also be used as a co-processor to the actual processors for such uses as sorting; cyclic shifting, as in a perfect shuffle network; and bitonic sorting. == Background == Interconnection networks are used to connect nodes, where a node can be a single processor or a group of processors, to other nodes. Interconnection networks can be categorized on the basis of their topology. Topology is the pattern in which one node is connected to other nodes. There are two main types of topology: static and dynamic. Static interconnect networks are hard-wired and cannot change their configurations. A regular static interconnect is mainly used in small networks made up of loosely coupled nodes. The regular structure signifies that the nodes are arranged in a specific shape and the shape is maintained throughout the network. Some examples of static regular interconnections are: Completely connected network: In a mesh network, multiple nodes are connected with each other. Each node in the network is connected to every other node in the network. This arrangement allows proper communication of the data between the nodes, but there is a lot of communication overhead due to the increased number of node connections. Shared bus: This network topology involves connection of the nodes with each other over a bus. Every node communicates with every other node using the bus. 
The bus utility ensures that no data is sent to the wrong node, but the bus traffic is an important parameter which can affect the system. Ring: This is one of the simplest ways of connecting nodes with each other. The nodes are connected with each other to form a ring. For a node to communicate with some other node, it has to send the messages to its neighbor. Therefore, the data message passes through a series of other nodes before reaching the destination. This increases latency in the system. Tree: This topology involves connection of the nodes to form a tree. The nodes are connected to form clusters and the clusters are in turn connected to form the tree. This methodology increases the complexity of the network. Hypercube: This topology consists of connections of the nodes to form cubes. The nodes are also connected to the nodes on the other cubes. Butterfly: This is one of the most complex connections of the nodes. As the figure suggests, the nodes are connected and arranged by rank, in the form of a matrix. In dynamic interconnect networks, the nodes are interconnected via an array of simple switching elements. This interconnection can then be changed by use of routing algorithms, such that the path from one node to other nodes can be varied. Dynamic interconnections can be classified as single stage interconnect networks, multistage interconnect networks, and crossbar switch connections. == Crossbar Switch Connections == In a crossbar switch, there is a dedicated path from one processor to other processors. Thus, if there are n inputs and m outputs, we will need n × m switches to realize a crossbar. As the number of outputs increases, the number of switches increases by a factor of n. For a large network this will be a problem. An alternative to this scheme is staged switching. 
== Single Stage Interconnect Network == In a single stage interconnect network, the input nodes are connected to the outputs via a single stage of switches. The figure shows an 8×8 single stage switch using shuffle exchange. As one can see, from a single shuffle, not all inputs can reach all outputs. Multiple shuffles are required for all inputs to be connected to all the outputs. == Multistage Interconnect Network == A multistage interconnect network is formed by cascading multiple single stage switches. The switches can then use their own routing algorithm, or be controlled by a centralized router, to form a completely interconnected network. Multistage interconnect networks can be classified into three types: Non-blocking: A non-blocking network can connect any idle input to any idle output, regardless of the connections already established across the network. A crossbar is an example of this type of network. Rearrangeable non-blocking: This type of network can establish all possible connections between inputs and outputs by rearranging its existing connections. Blocking: This type of network cannot realize all possible connections between inputs and outputs, because a connection from one free input to another free output can be blocked by an existing connection in the network. The number of switching elements required to realize a non-blocking network is highest, followed by rearrangeable non-blocking. A blocking network uses the fewest switching elements. == Examples == Multiple types of multistage interconnection networks exist. === Omega network === An Omega network consists of multiple stages of 2×2 switching elements. Each input has a dedicated connection to an output. An N×N omega network has log2(N) stages and N/2 switching elements in each stage, with a perfect shuffle between stages. Thus the network has complexity of O(N log N). Each switching element can employ its own switching algorithm. Consider an 8×8 omega network. There are 8! 
= 40320 1-to-1 mappings from input to output. There are 12 switching elements, each with two settings, for a total of 2^12 = 4096 possible switch configurations. Since 4096 is less than 40320, not every mapping can be realized; thus, it is a blocking network. === Clos network === A Clos network uses 3 stages to switch from N inputs to N outputs. In the first stage, there are r = N/n crossbar switches, each of size n×m. In the second stage there are m switches of size r×r, and the last stage is a mirror of the first stage, with r switches of size m×n. A Clos network will be completely non-blocking if m >= 2n-1. The number of connections, though greater than in an omega network, is much smaller than in a crossbar network. === Beneš network === A Beneš network is a rearrangeably non-blocking network derived from the Clos network by setting n = m = 2. There are (2log2(N) - 1) stages, with each stage containing N/2 crossbar switches of size 2×2. An 8×8 Beneš network has 5 stages of switching elements, and each stage has 4 switching elements. The center three stages consist of two 4×4 Beneš networks. The 4×4 Beneš network can connect any input to any output recursively.
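
The sizing formulas quoted in the examples above can be checked with a few lines of code. The sketch below only restates the arithmetic from the text; the function names (`omega_stats`, `counting_bound_shows_blocking`, `clos_strict_sense_nonblocking`, `benes_stats`) are hypothetical, and N is assumed to be a power of two. Note that the counting argument is a sufficient test for blocking only: when there are fewer switch configurations than permutations, some permutation cannot be realized, but the converse does not follow from counting alone.

```python
import math


def _log2_exact(n):
    """Return log2(n) for a power of two n >= 2, else raise ValueError."""
    if n < 2 or n & (n - 1):
        raise ValueError("N must be a power of two and at least 2")
    return n.bit_length() - 1


def omega_stats(n):
    """Stage and switch counts for an N x N omega network of 2x2 switches."""
    stages = _log2_exact(n)
    switches = stages * (n // 2)
    return {
        "stages": stages,
        "switches": switches,
        "settings": 2 ** switches,            # each 2x2 switch has two states
        "permutations": math.factorial(n),    # 1-to-1 input/output mappings
    }


def counting_bound_shows_blocking(n):
    """True if there are fewer switch settings than permutations."""
    stats = omega_stats(n)
    return stats["settings"] < stats["permutations"]


def clos_strict_sense_nonblocking(n, m):
    """Condition from the text: m >= 2n - 1 middle-stage switches."""
    return m >= 2 * n - 1


def benes_stats(n):
    """Stage count and switches per stage for an N x N Benes network."""
    return {"stages": 2 * _log2_exact(n) - 1, "switches_per_stage": n // 2}


if __name__ == "__main__":
    print(omega_stats(8))
    print(counting_bound_shows_blocking(8))
    print(clos_strict_sense_nonblocking(n=2, m=2))
    print(benes_stats(8))
```

With n = m = 2 the Clos condition m >= 2n - 1 is not met, which is consistent with the Beneš network being rearrangeably non-blocking rather than completely non-blocking.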

    Read more →
  • Data dictionary

    Data dictionary

    A data dictionary, or metadata repository, as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format". Oracle defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to databases and database management systems (DBMS): A document describing a database or collection of databases An integral component of a DBMS that is required to determine its structure A piece of middleware that extends or supplants the native data dictionary of a DBMS == Documentation == The terms data dictionary and data repository indicate a more general software utility than a catalogue. A catalogue is closely coupled with the DBMS software. It provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as DDL and DML compilers, the query optimiser, the transaction processor, report generators, and the constraint enforcer. On the other hand, a data dictionary is a data structure that stores metadata, i.e., (structured) data about information. The software package for a stand-alone data dictionary or data repository may interact with the software modules of the DBMS, but it is mainly used by the designers, users and administrators of a computer system for information resource management. These systems maintain information on system hardware and software configuration, documentation, application and users as well as other information relevant to system administration. If a data dictionary system is used only by the designers, users, and administrators and not by the DBMS Software, it is called a passive data dictionary. Otherwise, it is called an active data dictionary or data dictionary. When a passive data dictionary is updated, it is done so manually and independently from any changes to a DBMS (database) structure. 
With an active data dictionary, the dictionary is updated first and changes occur in the DBMS automatically as a result. Database users and application developers can benefit from an authoritative data dictionary document that catalogs the organization, contents, and conventions of one or more databases. This typically includes the names and descriptions of various tables (records or entities) and their contents (fields), plus additional details, like the type and length of each data element. Another important piece of information that a data dictionary can provide is the relationship between tables. This is sometimes referred to in entity-relationship diagrams (ERDs), or if using set descriptors, identifying which sets database tables participate in. In an active data dictionary constraints may be placed upon the underlying data. For instance, a range may be imposed on the value of numeric data in a data element (field), or a record in a table may be forced to participate in a set relationship with another record-type. Additionally, a distributed DBMS may have certain location specifics described within its active data dictionary (e.g. where tables are physically located). The data dictionary consists of record types (tables) created in the database by systems generated command files, tailored for each supported back-end DBMS. Oracle has a list of specific views for the "sys" user. This allows users to look up the exact information that is needed. Command files contain SQL Statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER TABLE (for referential integrity), etc., using the specific statement required by that type of database. There is no universal standard as to the level of detail in such a document. == Middleware == In the construction of database applications, it can be useful to introduce an additional layer of data dictionary software, i.e. middleware, which communicates with the underlying DBMS data dictionary. 
Such a "high-level" data dictionary may offer additional features and a degree of flexibility that goes beyond the limitations of the native "low-level" data dictionary, whose primary purpose is to support the basic functions of the DBMS, not the requirements of a typical application. For example, a high-level data dictionary can provide alternative entity-relationship models tailored to suit different applications that share a common database. Extensions to the data dictionary can also assist in query optimization against distributed databases. Additionally, DBA functions are often automated using restructuring tools that are tightly coupled to an active data dictionary. Software frameworks aimed at rapid application development sometimes include high-level data dictionary facilities, which can substantially reduce the amount of programming required to build menus, forms, reports, and other components of a database application, including the database itself. For example, PHPLens includes a PHP class library to automate the creation of tables, indexes, and foreign key constraints portably for multiple databases. Another PHP-based data dictionary, part of the RADICORE toolkit, automatically generates program objects, scripts, and SQL code for menus and forms with data validation and complex joins. For the ASP.NET environment, Base One's data dictionary provides cross-DBMS facilities for automated database creation, data validation, performance enhancement (caching and index utilization), application security, and extended data types. Visual DataFlex provides the ability to use DataDictionaries as class files to form a middle layer between the user interface and the underlying database. The intent is to create standardized rules to maintain data integrity and enforce business rules throughout one or more related applications. Some industries use generalized data dictionaries as technical standards to ensure interoperability between systems. 
The real estate industry, for example, abides by RESO's Data Dictionary, with which the National Association of REALTORS requires its MLSs to comply through its policy handbook. This intermediate mapping layer for MLSs' native databases is supported by software companies which provide API services to MLS organizations. == Platform-specific examples == Developers use a data description specification (DDS) to describe data attributes in file descriptions that are external to the application program that processes the data, in the context of an IBM i. The sys.ts$ table in Oracle stores information about every table in the database. It is part of the data dictionary that is created when the Oracle Database is created. Developers may also use DDS context from free and open-source software (FOSS) for structured and transactional queries in open environments. == Typical attributes == Here is a non-exhaustive list of typical items found in a data dictionary for columns or fields: entity or form name or its ID (EntityID or FormID), i.e. the group this field belongs to; field name, such as the RDBMS field name; displayed field title (may default to the field name if blank); field type (string, integer, date, etc.); measures such as min and max values, display width, or number of decimal places (different field types may interpret these differently; an alternative is to have different attributes depending on field type); field display order or tab order; coordinates on screen (if a positional or grid-based UI); default value; prompt type, such as drop-down list, combo-box, check-boxes, range, etc.; is-required (Boolean: if 'true', the value cannot be blank, null, or only white-spaces); is-read-only (Boolean); reference table name, if a foreign key (can be used for validation or selection lists); various event handlers or references to them, for example "on-click" or "on-validate" (see event-driven programming); format code, such as a regular expression or COBOL-style "PIC" statements; description or synopsis; database index characteristics or specification.
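
As a rough illustration of how a few of the typical attributes listed above might be represented and used for validation, here is a minimal sketch. It does not follow any particular product's schema: the class name `FieldEntry`, its attribute names, and the error messages are all assumptions, and only a subset of the listed attributes is modelled.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldEntry:
    """One hypothetical data dictionary entry describing a single field."""
    entity: str                         # entity or form the field belongs to
    name: str                           # field name, e.g. the RDBMS column name
    field_type: str                     # "string" or "integer" in this sketch
    title: str = ""                     # displayed title; falls back to name
    min_value: Optional[int] = None
    max_value: Optional[int] = None
    is_required: bool = False
    is_read_only: bool = False
    format_code: Optional[str] = None   # regular expression for strings

    def display_title(self):
        """Displayed field title, defaulting to the field name if blank."""
        return self.title or self.name

    def validate(self, value):
        """Return a list of problems; an empty list means the value is valid."""
        blank = value is None or (isinstance(value, str) and not value.strip())
        if blank:
            return ["value is required"] if self.is_required else []
        problems = []
        if self.field_type == "integer":
            if isinstance(value, bool) or not isinstance(value, int):
                return ["expected an integer"]
            if self.min_value is not None and value < self.min_value:
                problems.append(f"value below minimum {self.min_value}")
            if self.max_value is not None and value > self.max_value:
                problems.append(f"value above maximum {self.max_value}")
        elif self.field_type == "string":
            if not isinstance(value, str):
                return ["expected a string"]
            if self.format_code and not re.fullmatch(self.format_code, value):
                problems.append("value does not match format code")
        return problems


if __name__ == "__main__":
    age = FieldEntry(entity="Customer", name="age", field_type="integer",
                     min_value=0, max_value=150, is_required=True)
    print(age.display_title(), age.validate(200))
```

Keeping the rules in one entry per field is what allows an application layer to generate forms and validation from the dictionary instead of hard-coding them in each screen.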

    Read more →