Writing Complex SQLite Queries for Film Database Analysis

Listing Directors Who Directed Films in Leap Years

To list all directors who directed a film in a leap year, we need to understand the relationship between the DIRECTOR, MOVIE_DIRECTOR, and MOVIE tables. The MOVIE table contains the year column, which we will use to determine if a year is a leap year. A leap year is divisible by 4, but not by 100, unless it is also divisible by 400. This logic must be embedded in the SQL query.

The query involves joining the DIRECTOR, MOVIE_DIRECTOR, and MOVIE tables. The MOVIE_DIRECTOR table acts as a bridge between DIRECTOR and MOVIE, allowing us to link directors to the movies they directed. The MOVIE table provides the year of the movie, which we will filter to include only leap years.

Here is the SQL query:

SELECT DISTINCT D.fname, D.lname
FROM DIRECTOR D
JOIN MOVIE_DIRECTOR MD ON D.did = MD.did
JOIN MOVIE M ON MD.mid = M.mid
WHERE (M.year % 4 = 0 AND M.year % 100 != 0) OR (M.year % 400 = 0);

This query selects distinct director names (fname and lname) from the DIRECTOR table. It joins the DIRECTOR table with the MOVIE_DIRECTOR table on the did column, and then joins the result with the MOVIE table on the mid column. The WHERE clause filters the results to include only those movies made in leap years.

Identifying Films with the Largest Cast

Finding the film(s) with the largest cast involves counting the number of actors associated with each movie in the CAST table and then identifying the movie(s) with the maximum count. This requires aggregating data from the CAST table and joining it with the MOVIE table to get the movie names.

The query uses a subquery to first determine the maximum number of actors in any movie. Then, it selects the movies that have this maximum number of actors.

Here is the SQL query:

SELECT M.name, COUNT(C.aid) AS cast_size
FROM MOVIE M
JOIN CAST C ON M.mid = C.mid
GROUP BY M.mid
HAVING cast_size = (SELECT MAX(cast_count)
                    FROM (SELECT COUNT(aid) AS cast_count
                          FROM CAST
                          GROUP BY mid));

This query groups the results by movie ID (mid) and counts the number of actors (aid) for each movie. The HAVING clause filters the groups to include only those with a cast size equal to the maximum cast size found in the subquery.

Finding Actors Who Acted Only in Films Before 1960

To find actors who acted only in films before 1960, we need to ensure that all movies an actor has acted in were released before 1960. This involves checking the CAST table for actor-movie pairs and then filtering based on the year column in the MOVIE table.

The query uses a NOT EXISTS clause to exclude actors who have acted in any movie released in or after 1960. This ensures that only actors who have exclusively acted in pre-1960 films are selected.

Here is the SQL query:

SELECT A.fname, A.lname
FROM ACTOR A
WHERE NOT EXISTS (SELECT 1
                  FROM CAST C
                  JOIN MOVIE M ON C.mid = M.mid
                  WHERE C.aid = A.aid AND M.year >= 1960);

This query selects actor names from the ACTOR table where there does not exist a movie in the CAST and MOVIE tables that the actor has acted in and that was released in or after 1960.

Discovering Films with More Women Actors Than Men

To find films with more women actors than men, we need to count the number of male and female actors in each film and compare these counts. This involves joining the CAST table with the ACTOR table to access the gender column and then aggregating the data by movie.

The query uses conditional aggregation to count male and female actors separately and then compares these counts in the HAVING clause.

Here is the SQL query:

SELECT M.name
FROM MOVIE M
JOIN CAST C ON M.mid = C.mid
JOIN ACTOR A ON C.aid = A.aid
GROUP BY M.mid
HAVING SUM(CASE WHEN A.gender = 'female' THEN 1 ELSE 0 END) > 
       SUM(CASE WHEN A.gender = 'male' THEN 1 ELSE 0 END);

This query groups the results by movie ID (mid) and uses conditional aggregation to count the number of female and male actors. The HAVING clause filters the groups to include only those where the count of female actors is greater than the count of male actors.

Implementing PRAGMA journal_mode and Database Backup

To ensure data integrity and prevent corruption, especially after unexpected shutdowns or power failures, it is crucial to implement proper journaling modes and regular database backups in SQLite. The PRAGMA journal_mode command can be used to set the journaling mode, which controls how SQLite handles transactions and recovery.

The WAL (Write-Ahead Logging) mode is often recommended for its performance benefits and robustness. It allows reads to proceed concurrently with writes, which can significantly improve performance in multi-user environments.

Here is how to set the journal mode to WAL:

PRAGMA journal_mode=WAL;

Regular database backups are also essential. SQLite provides the .backup command in the SQLite command-line interface (CLI) to create a backup of the database. This command copies the entire database to a specified file, ensuring that the backup is consistent and can be restored in case of failure.

Here is an example of how to create a backup using the SQLite CLI:

sqlite3 original.db ".backup backup.db"

This command creates a backup of original.db and saves it as backup.db.

By combining proper journaling modes with regular backups, you can significantly enhance the reliability and durability of your SQLite database, ensuring that your data remains safe and recoverable even in the face of unexpected failures.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *