
# Backend Critical Issues & TODOs

This document outlines critical issues found in the backend codebase that require immediate attention. The issues range from severe data-integrity risks to security vulnerabilities and poor API design.
## 1. Catastrophic Data Loss via Cascading Deletes

**Severity:** CRITICAL

**Problem:**
The SQLAlchemy models in `be/app/models.py` are configured with `ondelete="CASCADE"` and `cascade="all, delete-orphan"` on relationships to critical shared data such as `Expense`, `List`, `Settlement`, and `Chore`. This creates a chain reaction of data destruction that a single user's action can trigger, affecting the data of every other user in a group.
**Specific Scenarios:**
- **Deleting a User:** When a user is deleted, all expenses they paid for, created, or took part in (via `ExpenseSplit`) are deleted, along with every settlement they participated in. This permanently corrupts the financial history of every group the user belonged to, violating data integrity for all other members.
- **Deleting a Group:** Deleting a group permanently deletes all associated data, including every expense, shopping list, item, settlement, and the entire chore history.
- **Deleting a List:** Deleting a shopping list deletes all associated expenses.
**Recommendation:**
- Immediately remove all `ondelete="CASCADE"` and `cascade="all, delete-orphan"` configurations from the models where they affect shared data.
- Implement a soft-delete or anonymization strategy for user deletion to comply with data protection regulations (such as GDPR's "right to be forgotten") without destroying the integrity of other users' data. When a user is "deleted," their PII should be scrubbed, but their ID and associated records should remain so that financial history stays intact.
## 2. Hidden Group Deletion in "Leave Group" Endpoint

**Severity:** CRITICAL

**Problem:**
There is no explicit `DELETE /groups/{group_id}` endpoint. Instead, the logic for deleting a group is hidden inside the `DELETE /groups/{group_id}/leave` endpoint in `be/app/api/v1/endpoints/groups.py`. If the last remaining member of a group, who is also the owner, calls this endpoint, the entire group and all of its associated data are silently and permanently deleted.
**Why this is a problem:**
- **Non-Obvious Destruction:** A user would never expect a "leave" action to destroy the entire group.
- **No Confirmation:** There is no "are you sure?" check. The action is immediate and irreversible.
- **High Risk of Accidental Deletion:** An owner could easily remove all members and then leave, accidentally wiping all data.
**Recommendation:**
- Create a separate, explicit `DELETE /groups/{group_id}` endpoint.
- Protect this new endpoint so that it is accessible only to the group owner.
- Implement multiple confirmation steps before allowing the deletion (e.g., requiring the user to type the group's name to confirm).
- The "leave group" endpoint should never result in the deletion of the group. If the last member leaves, the group should be archived or left empty, but not destroyed.
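The ownership and type-the-name confirmation checks can be sketched as a small helper that the new endpoint would call before touching the database; `Group` and `DeleteGroupPayload` here are hypothetical names, not existing project classes:

```python
# Hypothetical guard for an explicit DELETE /groups/{group_id} endpoint.
from dataclasses import dataclass


@dataclass
class Group:
    id: int
    name: str
    owner_id: int


@dataclass
class DeleteGroupPayload:
    confirm_name: str  # the user must retype the exact group name


def check_group_deletion(group: Group, requester_id: int,
                         payload: DeleteGroupPayload) -> None:
    """Raise unless the owner explicitly confirmed the destructive action."""
    if requester_id != group.owner_id:
        # At the API layer this maps to 403 Forbidden.
        raise PermissionError("Only the group owner may delete the group.")
    if payload.confirm_name != group.name:
        # Maps to 400 Bad Request: confirmation text did not match.
        raise ValueError("Confirmation name does not match the group name.")
    # Only after both checks pass may the caller proceed with deletion
    # (ideally a soft delete, per issue #1).
```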
## 3. User Deletion Triggers Data Cascade

**Severity:** CRITICAL

**Problem:**
The user management routes in `be/app/api/v1/endpoints/users.py` use the `fastapi-users` library. The configuration in `be/app/auth.py` uses the default `BaseUserManager`, which performs a hard delete on the user record in the database.

**Why this is a problem:**
This is the trigger for the catastrophic data loss described in issue #1. A single call to the `DELETE /users/me` endpoint initiates the cascade, destroying shared financial records across multiple groups. This is a direct violation of data integrity for other users.
**Recommendation:**
- Override the default `delete` method in the `UserManager` class in `be/app/auth.py`.
- The new method must implement the anonymization strategy discussed in issue #1: scrub the user's PII and mark the account as inactive, but do not delete the row from the database.
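The anonymization step itself can be sketched as a pure function that the overridden `delete` would call; the `UserRecord` fields here are assumptions to be adapted to the real model:

```python
# Sketch of the anonymization strategy: scrub PII, keep the row and its ID.
import uuid
from dataclasses import dataclass


@dataclass
class UserRecord:
    id: int
    email: str
    name: str
    hashed_password: str
    is_active: bool = True


def anonymize_user(user: UserRecord) -> UserRecord:
    """Replace PII with placeholders and deactivate the account.

    The primary key is preserved so that ExpenseSplit / Settlement rows
    referencing the user stay valid and financial history stays intact.
    """
    user.email = f"deleted-{uuid.uuid4().hex}@example.invalid"
    user.name = "Deleted User"
    user.hashed_password = ""  # login is no longer possible
    user.is_active = False
    return user
```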
## 4. Inconsistent and Flawed Authorization in Financial Endpoints

**Severity:** HIGH

**Problem:**
Authorization logic is inconsistent across the financial endpoints in `be/app/api/v1/endpoints/financials.py`, creating security gaps and potential for misuse.

**Specific Scenarios:**
- **Creating vs. Updating Expenses:** A regular user cannot create an expense paid by someone else, but they can modify an existing expense as long as they were the original payer. This is inconsistent. A better approach would be to allow only group owners to modify financial records created by others.
- **Deleting Settlements:** The `delete_settlement_record` endpoint dangerously allows the user who created a settlement record to delete it. A user could pay a debt, have the recipient confirm it, and then simply delete the record of the payment. Only group owners should have the power to delete financial records, and deletion should be a heavily audited action.
**Recommendation:**
- Establish a clear and consistent authorization policy. A good default policy would be:
  - Users can create and modify their own financial entries (e.g., expenses they paid for).
  - Only group owners can modify or delete financial entries created by other users.
- Refactor all financial endpoints (`expenses`, `settlements`) to enforce this policy strictly and consistently.
## 5. No Protection Against Over-Settlement of Debts

**Severity:** HIGH

**Problem:**
The `record_settlement_for_expense_split` endpoint in `financials.py` does not validate whether the settlement amount exceeds the amount owed for that expense split.

**Why this is a problem:**
A malicious user could exploit this by "settling" a $5 debt with a $500 payment. The system would record this, creating a false record that the original payer now owes the malicious user $495. This can be used to create phantom debts.
**Recommendation:**
- In `crud_settlement_activity.create_settlement_activity`, add validation to ensure the `amount_paid` in a settlement activity does not exceed the remaining `owed_amount` on the `ExpenseSplit`.
- If an overpayment is attempted, the API should return a `400 Bad Request` error with a clear message.
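A sketch of the guard that `create_settlement_activity` needs; the function name and arguments mirror the description above but are assumptions, not existing code:

```python
# Overpayment guard: a settlement may never exceed the outstanding amount.
from decimal import Decimal


class OverpaymentError(ValueError):
    """Maps to a 400 Bad Request at the API layer."""


def validate_settlement_amount(owed_amount: Decimal,
                               already_paid: Decimal,
                               amount_paid: Decimal) -> None:
    remaining = owed_amount - already_paid
    if amount_paid <= 0:
        raise OverpaymentError("Settlement amount must be positive.")
    if amount_paid > remaining:
        # Blocks the $500-against-$5 phantom-debt scenario.
        raise OverpaymentError(
            f"Cannot settle {amount_paid}: only {remaining} is outstanding."
        )
```

Using `Decimal` (not `float`) for the comparison avoids rounding surprises in currency arithmetic.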
## 6. Flawed Business Logic Preventing Group Expenses from Personal Lists

**Severity:** MEDIUM

**Problem:**
The `create_new_expense` endpoint prevents a user from creating a group expense that is associated with one of their personal shopping lists.

**Why this is a problem:**
This does not reflect a real-world use case. A user might use a personal list to shop for a group and then need to file the expense against that group. The current logic forces an artificial separation.
**Recommendation:**
- Remove the validation check that blocks this action. A user should be able to link a group expense to any list they have access to, regardless of whether it is a personal or group list. The critical factor is the `group_id` on the expense itself, not on the list it is optionally linked to.
## 7. Unrestricted Access to All Chore History

**Severity:** CRITICAL

**Problem:**
The chore history endpoints in `be/app/api/v1/endpoints/chores.py` lack any authorization checks.

**Specific Endpoints:**
- `GET /{chore_id}/history`
- `GET /assignments/{assignment_id}/history`

Any authenticated user can access the historical data for any chore or assignment in the entire database simply by guessing its ID. This allows users from one group to spy on the activities of other groups, which is a major data leak.
**Recommendation:**
- Add strict authorization checks to these endpoints immediately.
- Before returning any history, the code must verify that the `current_user` is a member of the group to which the chore or assignment belongs.
- For personal chores, it must verify that the `current_user` is the creator of the chore.
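The two rules above can be condensed into one helper that both history endpoints call before returning anything; the helper and its arguments are hypothetical, not existing project code:

```python
# Authorization sketch for the chore-history endpoints.
from typing import Optional, Set


def assert_can_view_chore_history(current_user_id: int,
                                  chore_group_id: Optional[int],
                                  chore_creator_id: int,
                                  group_member_ids: Set[int]) -> None:
    """Raise PermissionError unless the caller may see this chore's history.

    Maps to 403 Forbidden (or 404, to avoid confirming the ID exists)
    at the API layer.
    """
    if chore_group_id is None:
        # Personal chore: only its creator may view it.
        if current_user_id != chore_creator_id:
            raise PermissionError("Not the creator of this personal chore.")
    elif current_user_id not in group_member_ids:
        # Group chore: the caller must belong to the owning group.
        raise PermissionError("Not a member of this chore's group.")
```

Returning 404 instead of 403 for unauthorized IDs is worth considering, since a 403 still confirms to a guesser that the ID exists.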
## 8. Redundant and Confusing Chore Endpoints

**Severity:** LOW

**Problem:**
The API has a completely separate set of endpoints for "Personal Chores" (e.g., `POST /personal`, `DELETE /personal/{chore_id}`). It also provides generic endpoints (`PUT /{chore_id}`) that can perform the same actions and more, such as converting a personal chore to a group chore.

**Why this is a problem:**
- **API Bloat:** It makes the API larger and more confusing than necessary.
- **Maintenance Burden:** Duplicate logic increases the chance of bugs and makes the codebase harder to maintain. For example, if a business rule for updates changes, it might need to be updated in multiple places.
- **Inconsistency:** The `update_personal_chore` and `update_group_chore` endpoints have slightly different logic and permission checks than the generic `update_chore_any_type` endpoint, which is a recipe for subtle bugs.
**Recommendation:**
- Deprecate and remove the separate `/personal/*` and `/groups/{group_id}/chores/*` endpoints for `POST`, `PUT`, and `DELETE`.
- Consolidate all create, update, and delete logic into a single set of endpoints (e.g., `POST /chores`, `PUT /chores/{chore_id}`, `DELETE /chores/{chore_id}`).
- This single set of endpoints should contain clear, unified logic that handles permissions based on the chore's `type` (personal or group).
## 9. Sensitive User Data Leaked in Application Logs

**Severity:** MEDIUM

**Problem:**
Across multiple endpoints (`financials.py`, `groups.py`, `chores.py`, etc.) the code logs user email addresses, and sometimes internal object IDs, at the INFO or WARNING level. Example:

    logger.info(f"User {current_user.email} creating expense: {expense_in.description}")

Logging personally identifiable information (PII) such as email addresses at non-debug levels creates a compliance risk (GDPR/data protection) and can leak data if log files are exposed.
**Recommendation:**
- Remove PII (emails, names) from log messages at INFO/WARNING/ERROR levels.
- If user identifiers are required, use the numeric `user.id` or a hashed identifier.
- Move any detailed, user-specific logging to DEBUG level with redaction.
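A small sketch of the recommended pattern, using stdlib `logging` and `hashlib`; the logger name and helper are illustrative:

```python
# PII-free logging: numeric/hashed identifiers at INFO, details at DEBUG.
import hashlib
import logging

logger = logging.getLogger("app.financials")


def user_ref(user_id: int) -> str:
    """Short, non-reversible identifier suitable for INFO-level logs."""
    return "u:" + hashlib.sha256(str(user_id).encode()).hexdigest()[:10]


def log_expense_created(user_id: int, expense_id: int) -> None:
    # No email and no free-text description at INFO level.
    logger.info("User %s created expense %d", user_ref(user_id), expense_id)
    # Full identifiers only at DEBUG, where log access should be restricted.
    logger.debug("expense_created user_id=%d expense_id=%d",
                 user_id, expense_id)
```

Using `%s`-style lazy formatting (rather than f-strings) also avoids paying the formatting cost when the log level is disabled.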
## 10. Detailed Internal Errors Returned to Clients

**Severity:** MEDIUM

**Problem:**
Many `except` blocks convert low-level SQLAlchemy exceptions directly into HTTP 400/500 responses with `detail=str(e)`. This leaks internal table names, constraint names, and stack traces to the client. Examples: `crud/expense.py` and other CRUD modules.
**Recommendation:**
- Map all internal exceptions to generic error messages for clients (e.g., "Database error, please try again").
- Log the original exception server-side (with redaction), but never expose raw error strings to API consumers.
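The mapping can be centralized in one helper so no handler builds `detail=str(e)` by hand; `DatabaseError` below is a self-contained stand-in for `sqlalchemy.exc.SQLAlchemyError` so the sketch runs without a database:

```python
# Map internal exceptions to generic client-facing messages.
import logging

logger = logging.getLogger("app.crud")


class DatabaseError(Exception):
    """Stand-in for sqlalchemy.exc.SQLAlchemyError in this sketch."""


GENERIC_DB_MESSAGE = "Database error, please try again."


def safe_detail(exc: Exception) -> str:
    """Log the raw exception server-side; return only a generic message.

    The return value is what goes into HTTPException(detail=...).
    """
    logger.error("Database operation failed: %r", exc)
    if isinstance(exc, DatabaseError):
        return GENERIC_DB_MESSAGE
    return "Internal server error."
```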
## 11. Insecure Default Configuration Values

**Severity:** HIGH

**Problem:**
`app/config.py` sets insecure defaults that may accidentally reach production:
- `SESSION_SECRET_KEY = "your-session-secret-key"`, a published default secret.
- `ACCESS_TOKEN_EXPIRE_MINUTES = 480` (8 h), an excessive token lifetime that increases risk if a token is leaked.
- A wide-open `CORS_ORIGINS` allowing any localhost port.
**Recommendation:**
- Make `SESSION_SECRET_KEY` mandatory via env var with no default. Fail hard if missing.
- Reduce the default `ACCESS_TOKEN_EXPIRE_MINUTES` to ≤60 and allow override via env.
- In production mode (`settings.is_production`), restrict CORS origins to the canonical frontend domain list only.
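A sketch of the fail-hard behavior, assuming a plain settings module; with `pydantic-settings` the same effect comes from declaring the field without a default:

```python
# Refuse to boot with a missing or published default secret.
import os


def load_session_secret() -> str:
    secret = os.environ.get("SESSION_SECRET_KEY", "")
    if not secret or secret == "your-session-secret-key":
        raise RuntimeError(
            "SESSION_SECRET_KEY must be set to a strong random value."
        )
    return secret


def load_token_expiry_minutes() -> int:
    # Conservative default of 60 minutes, overridable via env.
    return int(os.environ.get("ACCESS_TOKEN_EXPIRE_MINUTES", "60"))
```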
## 12. Missing Global Rate Limiting / Brute-Force Protection

**Severity:** MEDIUM

**Problem:**
The API exposes authentication (JWT login) and other sensitive endpoints without any rate limiting or brute-force protection, leaving it vulnerable to password-guessing and scraping attacks.
**Recommendation:**
- Integrate a middleware-level rate limiter (e.g., `slowapi`, `fastapi-limiter`) backed by Redis.
- Apply stricter per-IP limits on `/auth/jwt/login`, OTP verification, and password-reset endpoints.
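To make the semantics concrete, here is a minimal in-process sliding-window limiter; this is only an illustration of the per-IP policy, since a real deployment should use `slowapi` or `fastapi-limiter` with Redis so that limits are shared across workers:

```python
# Minimal sliding-window rate limiter (illustrative; not multi-worker safe).
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        while q and now - q[0] > self.window:
            q.popleft()  # drop hits that fell out of the window
        if len(q) >= self.max_requests:
            return False  # translate to HTTP 429 Too Many Requests
        q.append(now)
        return True


# e.g. at most 5 login attempts per IP per minute
login_limiter = SlidingWindowLimiter(max_requests=5, window_seconds=60)
```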
## 13. Recurring Expense Job Never Commits Transactions

**Severity:** HIGH

**Problem:**
`be/app/jobs/recurring_expenses.py` uses `db.flush()` inside `_generate_next_occurrence` but never calls `commit()` on the `AsyncSession`. Because the helper is executed inside the scheduler wrapper `run_recurring_expenses_job`, the surrounding context manager closes the session without an explicit commit, so Postgres rolls the transaction back. No newly generated expenses are permanently saved, and the `max_occurrences` counter is not decremented. At the next scheduler tick the exact same expenses are regenerated, flooding the database and the UI with duplicates.
**Recommendation:**
- Replace the manual `flush()` calls with a single transactional block (e.g., `async with db.begin(): …`) and call `await db.commit()` once per expense or after the batch loop.
- Add an integration test that verifies a generated expense actually persists after the job finishes.
- Consider running the scheduler job inside a database advisory lock to avoid multiple workers generating duplicates concurrently.
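The commit-after-the-batch shape can be sketched with a fake session so the pattern runs without a database; in the real job `db` is a SQLAlchemy `AsyncSession` and the job/function names below are illustrative:

```python
# Sketch of the commit fix for the recurring-expense job.
import asyncio


class FakeAsyncSession:
    """Stand-in for AsyncSession: tracks what was added vs. committed."""

    def __init__(self):
        self.pending, self.committed = [], []

    def add(self, obj):
        self.pending.append(obj)

    async def commit(self):
        self.committed.extend(self.pending)
        self.pending = []


async def run_recurring_expenses(db, due_templates):
    for template in due_templates:
        db.add({"generated_from": template})
    # The missing piece in the current job: one explicit commit after the
    # batch, so the scheduler's session teardown cannot roll the work back.
    await db.commit()


db = FakeAsyncSession()
asyncio.run(run_recurring_expenses(db, ["rent", "internet"]))
```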
## 14. Blocking, Mixed-Sync Scheduler Configuration Can Starve the Event Loop

**Severity:** MEDIUM

**Problem:**
`be/app/core/scheduler.py` converts the async database URL to a sync URL (`postgresql://`) for `SQLAlchemyJobStore` and registers the job with a `ThreadPoolExecutor`. This combination forces a blocking, synchronous DB driver (psycopg2) to execute inside FastAPI's event-loop thread pool. Under load, long-running jobs will consume the limited thread pool, delay HTTP requests, and may exhaust the sync connection pool, causing 5xx errors throughout the API layer.
**Recommendation:**
- Use `AsyncIOJobStore` with the existing `postgresql+asyncpg` URL, or move scheduled tasks to a dedicated worker (e.g., Celery, RQ) outside the API process.
- If synchronous execution is unavoidable, enlarge the job executor pool and run the scheduler in a separate process so it cannot impact request latency.
## 15. Public Root & Health Endpoints Leak Deployment Metadata

**Severity:** MEDIUM

**Problem:**
Unauthenticated `GET /` and `GET /health` responses include the `environment` ("development", "staging", "production") and the full semantic `version` string straight from `app/config.py`. Attackers can fingerprint deployment targets and time their exploits around release cycles.
**Recommendation:**
- Limit environment and version details to authenticated admin users or internal monitoring networks.
- Replace the public response with a minimal `{ "status": "ok" }` payload.
- Add a separate, authenticated diagnostics endpoint for detailed build info.
## 16. Wildcard-Style CORS With `allow_credentials=True` Enables Session Hijacking

**Severity:** HIGH

**Problem:**
`be/app/main.py` registers `CORSMiddleware` with the entire `settings.cors_origins_list` and sets `allow_credentials=True`, which instructs browsers to send cookies and HTTP-only session tokens to any origin in that list. Because the default list contains every localhost port and is hard-coded, a malicious site running on another local port can silently read or mutate a user's data via XHR once the victim logs in to the real front end.
**Recommendation:**
- In production, populate `CORS_ORIGINS` from env and restrict it to the canonical front-end domain(s).
- Set `allow_credentials=False` unless cookie-based auth is absolutely required; JWTs are already used for API auth.
- Use the `strict-origin-when-cross-origin` referrer policy and enable CSRF protection if cookies stay enabled.
## 17. Groups & Settlements Lack Optimistic-Locking Enforcement

**Severity:** MEDIUM

**Problem:**
The `groups`, `settlements`, and `settlement_activities` tables include a `version` column, but none of the corresponding endpoints check or increment that version during `PUT`/`DELETE` operations (unlike `lists` and `items`). Concurrent updates therefore overwrite each other silently, leading to lost data and an inconsistent financial history.
**Recommendation:**
- Mirror the pattern used in the Lists API: require clients to supply the expected `version` and return `409 Conflict` on mismatch.
- Increment `version` in SQL via `UPDATE … SET version = version + 1` inside the same transaction to guarantee atomicity.
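The compare-and-increment `UPDATE` can be demonstrated with stdlib `sqlite3`; the same `WHERE version = ?` / `SET version = version + 1` shape applies to the Postgres tables above (the table and helper names here are illustrative):

```python
# Optimistic-locking UPDATE: succeed only if the caller's version matches.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE groups (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)"
)
conn.execute("INSERT INTO groups VALUES (1, 'Flat 4B', 1)")


def update_group_name(conn, group_id: int, new_name: str,
                      expected_version: int) -> bool:
    """Return True on success; False means a concurrent update won the race
    (the API layer should turn False into a 409 Conflict)."""
    cur = conn.execute(
        "UPDATE groups SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, group_id, expected_version),
    )
    return cur.rowcount == 1
```

Because the version check and the increment happen in a single statement, no separate `SELECT ... FOR UPDATE` is needed for this pattern.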
## 18. Redis Connection Pool Never Closed

**Severity:** LOW

**Problem:**
`be/app/core/redis.py` creates a module-level `redis_pool`, but the pool is never closed on application shutdown. In long-running worker processes (e.g., during Kubernetes rolling updates) this leaks file descriptors and keeps TCP connections open until the OS forcibly times them out.
**Recommendation:**
- Add a lifespan handler or shutdown event that calls `await redis_pool.aclose()`.
- Use a factory in the DI system to create and dispose Redis connections per request where feasible.