MEASURE 2.1: Test sets, metrics, and details about the tools used during TEVV are documented.
MEASURE 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are representative of the relevant population.
MEASURE 2.3: AI system performance or assurance criteria are measured qualitatively or quantitatively and demonstrated for conditions similar to deployment setting(s). Measures are documented.
MEASURE 2.4: The functionality and behavior of the AI system and its components - as identified in the MAP function - are monitored when in production.
MEASURE 2.5: The AI system to be deployed is demonstrated to be valid and reliable. Limitations of the generalizability beyond the conditions under which the technology was developed are documented.
MEASURE 2.6: The AI system is evaluated regularly for safety risks - as identified in the MAP function. The AI system to be deployed is demonstrated to be safe, its residual negative risk does not exceed the risk tolerance, and it can fail safely, particularly if made to operate beyond its knowledge limits. Safety metrics reflect system reliability and robustness, real-time monitoring, and response times for AI system failures.
MEASURE 2.7: AI system security and resilience - as identified in the MAP function - are evaluated and documented.
MEASURE 2.8: Risks associated with transparency and accountability - as identified in the MAP function - are examined and documented.
MEASURE 2.9: The AI model is explained, validated, and documented, and AI system output is interpreted within its context - as identified in the MAP function - to inform responsible use and governance.
MEASURE 2.10: Privacy risk of the AI system - as identified in the MAP function - is examined and documented.
MEASURE 2.11: Fairness and bias - as identified in the MAP function - are evaluated and results are documented.
MEASURE 2.12: Environmental impact and sustainability of AI model training and management activities - as identified in the MAP function - are assessed and documented.
MEASURE 2.13: Effectiveness of the employed TEVV metrics and processes in the MEASURE function are evaluated and documented.
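
As an illustration only, the sketch below shows one way a subcategory such as MEASURE 2.11 could be made operational: it computes per-group selection rates and their largest gap (often called the demographic parity difference), then packages the result as a record that can be documented. The choice of metric, the function names (`selection_rates`, `demographic_parity_difference`, `fairness_record`), and the record fields are assumptions made for this example; the framework does not prescribe any particular metric or format.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the fraction of positive (== 1) predictions for each group."""
    if len(predictions) != len(groups) or not predictions:
        raise ValueError("predictions and groups must be non-empty and equal length")
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Largest gap in selection rate between any two groups (0.0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


def fairness_record(predictions, groups, test_set_name):
    """Bundle the result into a plain dict suitable for documentation.

    The field names are hypothetical, not a format defined by the framework.
    """
    return {
        "subcategory": "MEASURE 2.11",
        "test_set": test_set_name,
        "metric": "demographic_parity_difference",
        "selection_rates": selection_rates(predictions, groups),
        "value": demographic_parity_difference(predictions, groups),
    }


if __name__ == "__main__":
    # Toy data: binary predictions and a hypothetical group label per example.
    preds = [1, 0, 1, 1, 0, 0, 1, 0]
    grps = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(fairness_record(preds, grps, "holdout-example"))
```

A single metric of this kind covers only a narrow slice of fairness; in practice it would sit alongside other measures and the contextual analysis identified in the MAP function.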