Product Roadmap and Future Development Plan:
Our upcoming plans for blockchain-related development are outlined as follows:
6.1 Inference Endpoints Marketplace with zero-knowledge proofs (verifiable ML)
Concept overview
Traditionally, model buyers may worry that sellers alter parameters to cut computational costs or quietly downgrade the model in production. Buyers want assurance that the machine learning model an entity claims to have run is indeed the one that ran. Consider, for example, a model accessible behind an API whose purveyor maintains multiple versions: a cheaper, less accurate one, and a more expensive, higher-performance one. Without proofs, you have no way of knowing whether the purveyor is serving you the cheaper model when you have paid for the more expensive one (e.g., to save on server costs and boost their profit margin). A familiar case: many users have complained that GPT-4's output quality declined noticeably over time compared with what was demonstrated at launch.
Integrating zero-knowledge proof mechanisms on the blockchain can remove these concerns. With this technology, API inference endpoints become more efficient and reliable, as buyers no longer need to worry about the authenticity and quality of the model. Furthermore, model owners can crowdsource "model validation" tasks in real time to users on our Xelora platform. This crowdsourcing approach ensures that the model's outputs are continuously validated, maintaining customer satisfaction with the model's quality and performance.
6.2 Consensus-driven labeling and model monitoring (on-chain)
In the realm of AI training and production, ensuring the accuracy and reliability of labeled data is paramount, yet it remains a significant challenge. Current web2 methodologies often rely on a limited set of annotators whose biases and errors can inadvertently skew the data, leading to models that are less robust and potentially discriminatory. Furthermore, once AI models are deployed, monitoring their performance and integrity in a transparent, tamper-proof manner is difficult, especially in critical applications where trust is essential.
Integrating a consensus-driven labeling and model monitoring system on the blockchain offers a promising solution to these problems. The consensus mechanism inherent in blockchain provides a decentralized, democratic approach to validating and verifying labeled data, ensuring that it is not only accurate but also easily traceable on chain later, which significantly reduces the risk of bias and errors. Moreover, by utilizing blockchain for model monitoring, every interaction with and adjustment to the AI model can be recorded in an immutable ledger, enhancing transparency and accountability.
Here is how it could technically unfold:
Task Distribution: A smart contract distributes a data labeling task to a predetermined number of participants (e.g., 10 people) in the network. Each participant reviews the same piece of data and submits their label to the smart contract.
Consensus Threshold: The smart contract is programmed with a consensus threshold, in this case, 70% or more. This means that if at least 70% of participants agree on a particular label, the smart contract automatically accepts this label as the final consensus result for the data point.
Result Recording: Once the consensus is achieved, the smart contract records the agreed-upon label on the blockchain. This record includes the label itself, the task identifier, and possibly the identifiers of the participants who contributed to the consensus. This ensures transparency and traceability.
Token Incentive Distribution: Participants who contributed to the consensus are paid instantly and automatically in tokens, as per the smart contract's terms. This incentivizes honest, high-quality participation.
To enhance the quality of labeling, contributors will be required to stake tokens as a commitment to accountability; low-quality or dishonest work puts that stake at risk. This mechanism encourages contributors to maintain high standards and deliver superior work.
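The steps above can be sketched as a small off-chain simulation; in production this logic would live in a smart contract. The 70% threshold comes from step 2, while the function name and the 5-token reward are illustrative assumptions:

```python
# Off-chain sketch of the consensus-labeling flow (steps 1-4 above).
# The per-participant reward amount is an illustrative assumption.
from collections import Counter

CONSENSUS_THRESHOLD = 0.70   # at least 70% of participants must agree
REWARD_TOKENS = 5            # illustrative per-participant payout

def resolve_labels(submissions: dict[str, str]):
    """Return (consensus_label, rewards) for one labeling task.

    `submissions` maps participant id -> submitted label. If no label
    reaches the threshold, the task stays unresolved: (None, {}).
    """
    counts = Counter(submissions.values())
    label, votes = counts.most_common(1)[0]
    if votes / len(submissions) < CONSENSUS_THRESHOLD:
        return None, {}
    # Pay every participant who voted with the majority.
    rewards = {p: REWARD_TOKENS for p, l in submissions.items() if l == label}
    return label, rewards

# 10 participants label the same data point; 8 of 10 agree (80% >= 70%).
labels = {f"user{i}": "cat" for i in range(8)} | {"user8": "dog", "user9": "dog"}
label, rewards = resolve_labels(labels)
print(label, sum(rewards.values()))  # cat 40
```

On chain, the record written in step 3 (label, task identifier, contributing participants) would correspond to the returned `label` and the keys of `rewards`.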
6.3 Crowdsourcing marketplace for AI development on chain
Challenges in Traditional Crowdsourcing Marketplaces such as Amazon Mechanical Turk:
High Fees and Margin Costs: Platforms like Amazon Mechanical Turk often charge significant fees (20%), reducing the earnings of task contributors and increasing costs for task requesters. These platforms act as intermediaries, necessitating higher margins to cover operational costs.
Lack of Transparency: Traditional platforms may not provide detailed insight into how tasks are distributed, completed, or how disputes are resolved. This lack of transparency can lead to trust issues among participants.
Centralized Control: Centralized platforms have complete control over the rules, fee structures, and the distribution of tasks, which can lead to imbalances in power and potential censorship or bias in task distribution.
Limited Trust and Security: Participants must trust the platform for payment processing, personal data security, and fair dispute resolution, which can be problematic if the platform fails to uphold high standards of integrity and security.
Inefficient Dispute Resolution: Centralized platforms often have slow and bureaucratic dispute resolution processes, which can be frustrating for both task requesters and contributors.
Slow Payments for Contributors: Contributors traditionally wait at least 30 days to receive payment, and conventional payment methods always charge fees (e.g., 4.4% for PayPal).
Blockchain-Based Crowdsourcing Marketplace Advantages:
Reduced Costs through Decentralization: By operating on the blockchain, the marketplace can significantly reduce or even eliminate high platform fees. Smart contracts automate task distribution, completion verification, and instant payments, reducing the need for a costly intermediary.
Enhanced Transparency and Trust: All transactions, including task postings, completions, and payments, are recorded on the blockchain, providing an immutable and transparent audit trail. This transparency builds trust among participants.
Democratic Task Governance: Utilizing a consensus mechanism for task validation and dispute resolution democratizes the marketplace. Participants can have a say in important decisions, such as changes to the platform's rules or dispute outcomes, based on a consensus model.
Secure, Direct, and Instant Payments: Payments are made directly, automatically, and instantly between participants through token-based systems secured by blockchain technology. This reduces the risk of fraud and non-payment.
Efficient and Fair Dispute Resolution: Disputes can be resolved through a consensus mechanism, where selected or interested participants can vote on the outcome. This process can be quicker and perceived as fairer compared to centralized arbitration.
Incentive and Reputation Systems: The platform can incorporate token-based incentives for high-quality work and participation in governance. A transparent reputation system, recorded on the blockchain, can help ensure reliability and quality of work.
By addressing the limitations of traditional crowdsourcing platforms with a blockchain-based approach, this new marketplace can offer a more equitable, transparent, and efficient environment for AI development tasks. The integration of consensus mechanisms not only for task validation but also for governance and dispute resolution introduces a level of democratic participation and trust that is often lacking in centralized platforms.
6.4 Launchpad for AI projects
Our ecosystem offers a comprehensive suite of tools for AI projects, and as part of our ongoing commitment to supporting innovation, we also plan to introduce a launchpad for AI projects. This platform will give projects the opportunity to secure funding directly from the community through token sales. Our fee will be either a percentage of the funds each project successfully raises or a fixed fee.
6.5 Decentralized dataset pool (subscriptions starting from xxx USD/month; final pricing to be announced)
Xelora is committed to building a decentralized, high-quality AI dataset pool to support the AI community, launching in early 2025. One of our core strengths is our extensive background in the AI data field, which allows us to seed the pool with a vast collection of rare and valuable real-world datasets. These datasets include, but are not limited to:
Real call center audio data in multiple languages
Diverse image datasets representing various ethnicities
Medical datasets in DICOM format
Authentic medical dictation audio datasets
These datasets, valued at several million USD in the current market, will form the foundation of the pool. By contributing this invaluable data, we aim to establish a strong initial foundation that attracts early customers. This, in turn, will incentivize new contributors to join the ecosystem, creating a robust and sustainable decentralized AI dataset marketplace.
One of the key advantages for customers is a simple flat monthly subscription fee. With this model, customers gain access to a continuously expanding pool of high-quality datasets as new contributions are added, ensuring long-term value and scalability for their AI development needs.
1. Key Factors for Point Distribution (revenue sharing for contributors)
Points will be awarded based on the following three factors:
Dataset Quality:
Low Quality = 0 points
Average Quality = 10 points
High Quality = 20 points
Determined by experienced validators who evaluate the dataset for its completeness, accuracy, and relevance.
Dataset Usage:
Each paid account can only contribute one counted usage per dataset.
10 points per usage will be awarded for each unique account accessing the dataset.
This ensures that dataset usage is highly rewarded but prevents the abuse of fake accounts inflating usage points.
Dataset Demand/Niche Bonus:
Datasets that are rare, specialized, or high in demand (e.g., niche medical data, underrepresented languages) will receive an additional 10 points as a niche bonus.
2. Revenue Pool Allocation
50% of the $130 (tentative) subscription fees will be shared with contributors who provide qualified datasets.
The remaining 50% of the subscription revenue will be split as follows:
35% for Validators: To reward the qualified validators for reviewing datasets and ensuring quality.
15% for the Platform Provider: To cover platform operational costs, system maintenance, and infrastructure.
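Under these tentative figures, the allocation can be sketched as follows; the subscriber count is an illustrative assumption:

```python
# Sketch of the tentative revenue split described above.
# Percentages come from the text; the subscriber count is hypothetical.

SUBSCRIPTION_FEE = 130.0   # tentative monthly fee, USD

def split_revenue(subscribers: int) -> dict[str, float]:
    """Split total subscription revenue 50/35/15 among the three parties."""
    total = subscribers * SUBSCRIPTION_FEE
    return {
        "contributors": total * 50 / 100,   # 50% to dataset contributors
        "validators":   total * 35 / 100,   # 35% to validators
        "platform":     total * 15 / 100,   # 15% to the platform provider
    }

print(split_revenue(1000))
# {'contributors': 65000.0, 'validators': 45500.0, 'platform': 19500.0}
```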
3. Revenue Sharing Formula
Contributors will receive a share of the total 50% revenue pool based on the points they have accumulated:
Contributor's Revenue Share (%) = (Contributor’s Points / Total Points in the System) × 100
Contributor’s Revenue = (Contributor’s Revenue Share %) × Contributor Revenue Pool (i.e., 50% of total subscription revenue)
4. Point Example for a Contributor
Let’s say a contributor submits 10 datasets, each with the following characteristics:
Quality: High (20 points each)
Usage: 10 unique accounts access each dataset (10 points per usage, totaling 100 points per dataset)
Niche Bonus: Yes (10 points per dataset)
Total Points for the Contributor = 10 datasets × (20 + 100 + 10) = 1,300 points
5. Limit on Revenue Share
To prevent any single contributor from monopolizing the revenue, a cap of 15% is placed on the maximum revenue share any one contributor can earn from the pool.
Max Revenue Share = 15% of the 50% revenue pool
For example, if the 50% revenue pool is $50,000, no single contributor can earn more than $7,500.
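The point rules, the pro-rata formula, and the 15% cap can be sketched together. The function names and the 10,000-total-points figure are illustrative assumptions; the other numbers come from the text:

```python
# Sketch of the contributor point and revenue-share rules (sections 1, 3, 5).
# The 10,000-total-points figure below is an illustrative assumption.

QUALITY_POINTS = {"low": 0, "average": 10, "high": 20}
USAGE_POINTS = 10     # per unique paid account accessing the dataset
NICHE_BONUS = 10      # for rare or high-demand datasets
MAX_SHARE_PCT = 15    # cap on any single contributor's share of the pool

def dataset_points(quality: str, unique_users: int, niche: bool) -> int:
    """Points for one dataset: quality + usage + optional niche bonus."""
    return (QUALITY_POINTS[quality]
            + USAGE_POINTS * unique_users
            + (NICHE_BONUS if niche else 0))

def contributor_revenue(points: int, total_points: int, pool_usd: float) -> float:
    """Pro-rata share of the contributor pool, capped at 15% of the pool."""
    share = pool_usd * points / total_points
    cap = pool_usd * MAX_SHARE_PCT / 100
    return min(share, cap)

# Worked example from the text: 10 high-quality niche datasets,
# each accessed by 10 unique paid accounts.
points = sum(dataset_points("high", 10, True) for _ in range(10))
print(points)  # 1300

# With a $50,000 contributor pool and 10,000 total points in the system,
# the raw share is $6,500, which is under the $7,500 cap.
print(contributor_revenue(points, 10_000, 50_000.0))  # 6500.0
```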
6. Validator System
Validators must be qualified professionals with experience in the AI data labeling industry. They will be responsible for evaluating the quality of datasets.
35% of the total subscription revenue will be allocated to validators, ensuring they are well-rewarded for their contributions to maintaining data quality.
Validators may be required to stake tokens to incentivize accountability and quality assessment.
7. Fraud Prevention and Usage Restrictions
Paid accounts only: Only paid users (those who pay the $130 subscription fee) can access datasets, significantly reducing the incentive for creating fake accounts to manipulate usage.
One counted usage per dataset per account: Each paid account can only generate one counted usage per dataset. This prevents bad actors from inflating usage by repeatedly accessing the same dataset.
Fraud Detection: Automated tools will monitor suspicious usage patterns and flag accounts that attempt to manipulate the system.
8. Dynamic Point Systems and Additional Protections
Diminishing Returns for High Usage: If abuse patterns are detected, the system can introduce diminishing returns for high usage counts (e.g., points decrease after the first 10 usages from unique accounts).
Revenue Pool Cap: The system can adjust the revenue pool allocation dynamically based on contributor behavior, ensuring fairness across the board.
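One possible shape for the diminishing-returns rule above, sketched in Python. The halving rate after the tenth unique account is an assumption, since the text only states that points decrease past that threshold:

```python
# Hypothetical diminishing-returns curve for usage points (section 8).
# Full credit for the first 10 unique accounts, half credit thereafter;
# the 0.5 decay factor is an illustrative assumption.

def usage_points(unique_users: int, full_credit: int = 10,
                 per_use: int = 10, decay: float = 0.5) -> float:
    """10 points per unique use up to `full_credit` uses, then reduced credit."""
    full = min(unique_users, full_credit)
    extra = max(unique_users - full_credit, 0)
    return per_use * full + per_use * decay * extra

print(usage_points(10))  # 100.0 -- unchanged below the threshold
print(usage_points(20))  # 150.0 -- instead of 200 without diminishing returns
```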
9. Final Workflow for Contributors
Dataset Submission: Contributors submit datasets and earn points based on quality, usage, and demand.
Revenue Distribution: At the end of each cycle (e.g., monthly), contributors receive their share of the 50% revenue pool based on the points they have accumulated.
Validator and Platform Rewards: Validators receive their share from the 35% pool, and the platform retains 15% for operational costs.
Monitoring & Updates: The system continuously monitors usage patterns and updates the point system if necessary to maintain fairness.
Adjustments to Revenue Sharing:
50% of the total subscription revenue will be distributed to contributors based on points.
35% will be allocated to validators.
15% will be kept by the platform provider for operational costs.