DIY AI Part 5: How to set up a central scheduler for your AI project


If you have been following along with our DIY AI program, you now have your file system set up and are ready to start coding. If this is your first visit, it’s a good idea to start at the beginning, as we have already covered several things, including setting up Visual Studio Code, installing important Python libraries, and setting up your virtual environment. In this guide, we are going to set up the central scheduler, which will be the first script in our DIY AI development project. If you think of the virtual environment as the brain and the file system as the body, the central scheduler is the heart.

What is the central scheduler?

The central scheduler will ensure that the scripts we write will run automatically and on time. A lot of AI work will involve gathering data to study at specific times. For instance, to learn about the file system on your computer, the files you regularly use, and the files that might be harmful to you, you might want the AI to scan your drives every night while you sleep. The central scheduler will help you do that.

Once your program is underway, you might find that you have a lot of files to run. This script will help you stay on top of it by implementing a simple system that lets you add and remove tasks quickly and easily.

Before you begin

Choose your storage method

One of the things you’ll find about DIY is that there are lots of different ways to do things, and the central scheduler is one script in this project where you have real choices in how you build it. The main concern is that, as both a developer and a user of the program, you will often need to change which scripts run and when. Therefore, the code needs to read that information from an external source. Otherwise, you would need to rewrite the script every time your needs change.

Popular options include using a JSON file or database, like Microsoft Access or SQLite. JSON is fine when you are only running a few files, and it can be quick to add and remove tasks, but the file can get large and hard to manage when your program starts getting larger. I recommend going with a Microsoft Access database, and it’s the option we are going to take just because I already use it. However, there are plenty of other database options, and the coding will largely be the same if you prefer to use something else. Every database will use tables that you can set up like we do here and then access using Python to know which scripts to run and when. We’ll also be able to use different tables in the same database to store all kinds of data for our AI to use.
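For comparison, here is a rough sketch of what the JSON route could look like. The function name load_tasks_from_json and the key names are my own assumptions, chosen to mirror the database columns used later in this guide, so treat it as an illustration rather than a required format.

```python
import json
import tempfile
from pathlib import Path


def load_tasks_from_json(path):
    """Read a list of task dictionaries from a JSON file.

    Assumed (hypothetical) layout: a JSON array of objects with the keys
    "id", "script", "interval", and optionally "time".
    """
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    # Return tuples so the shape resembles rows fetched from a database
    return [(t["id"], t["script"], t["interval"], t.get("time")) for t in raw]


if __name__ == "__main__":
    # Hypothetical sample tasks, written to a temporary file for the demo
    sample = [
        {"id": 1, "script": "scan_drives.py", "interval": "daily", "time": "02:00"},
        {"id": 2, "script": "speed_test.py", "interval": "hourly"},
    ]
    with tempfile.TemporaryDirectory() as folder:
        task_file = Path(folder) / "tasks.json"
        task_file.write_text(json.dumps(sample, indent=2), encoding="utf-8")
        for task in load_tasks_from_json(task_file):
            print(task)
```

This works well for a handful of tasks, but every change means editing the file by hand, which is part of why I prefer a database table.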

Setting up the database table

Your table will require six fields. One automatically sets the ID of the entry, one is a date and time, and the other four are short text. The script later in this guide reads five of them by name: id, script, interval, time, and last_run. I recommend saving the table as “ai_tasks”, or anything besides “tasks”, a name that many programs already use.
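If you would rather use SQLite, which ships with Python, a comparable table could be created along the lines below. The column names are the five that the scheduler script queries later; the column types, the helper names create_tasks_table and add_task, and the sample row are assumptions on my part, and the sixth field is left out because this guide does not name it.

```python
import sqlite3

# Columns match the ones the scheduler script selects later in this guide.
# The types are assumptions; adjust them to suit your own database.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS ai_tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    script TEXT NOT NULL,
    interval TEXT NOT NULL,
    time TEXT,
    last_run TIMESTAMP
)
"""


def create_tasks_table(conn):
    """Create the ai_tasks table if it does not exist yet."""
    conn.execute(CREATE_SQL)
    conn.commit()


def add_task(conn, script, interval, time_str=None):
    """Insert one task row and return its automatically assigned id."""
    cur = conn.execute(
        "INSERT INTO ai_tasks (script, interval, time) VALUES (?, ?, ?)",
        (script, interval, time_str),
    )
    conn.commit()
    return cur.lastrowid


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, just for the demo
    create_tasks_table(conn)
    add_task(conn, "scan_drives.py", "daily", "02:00")  # hypothetical script name
    print(conn.execute("SELECT * FROM ai_tasks").fetchall())
    conn.close()
```

In Access you would build the same table through the table designer instead, using an AutoNumber field for the ID.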

Install additional libraries

The libraries we installed in a past guide serve as the main framework for our program. Still, we may need to add additional libraries when the need arises, and this is such an occasion. We will need to install the “pyodbc” library as well as the “schedule” library.

The “pyodbc” library connects to any database that has an ODBC driver, which covers the most common ones, including Microsoft Access, Oracle, MySQL, and others. The “schedule” library makes scheduling a breeze with its easy syntax and powerful features.

To install them, type this code in the prompt while in the virtual environment:

pip install pyodbc schedule

The Python script

With the libraries installed, we can start writing the Python script that will check our database for scripts that we want to run at specific intervals. Let’s go over each part of this script, and then I’ll give you the whole thing to copy and paste.

Import the libraries

Even though we installed the libraries, we still need to import them to use them, and that is what this part of the code is doing, importing the schedule, time, datetime, and pyodbc libraries.

import schedule
import time
from datetime import datetime
import pyodbc

Database connection

This next part of the script will handle connecting to the database. You will need to change the database path to the location of the database holding the table we created earlier.

DB_PATH = r"C:\Path\To\TaskScheduler.accdb"  # Update with your file path
CONN_STR = f"DRIVER={{Microsoft Access Driver (*.mdb, *.accdb)}};DBQ={DB_PATH};"
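The connection string only works if the Access ODBC driver it names is installed on your machine. As a quick sanity check, you could list the drivers pyodbc can see before going any further. This is a small sketch, and the helper name has_access_driver is my own invention, not part of pyodbc.

```python
ACCESS_DRIVER = "Microsoft Access Driver (*.mdb, *.accdb)"


def has_access_driver(driver_names):
    """Return True if the Access ODBC driver appears in the given list."""
    return ACCESS_DRIVER in driver_names


if __name__ == "__main__":
    try:
        import pyodbc  # imported here so the helper above works without it
    except ImportError:
        print("pyodbc is not installed in this environment.")
    else:
        installed = pyodbc.drivers()  # names of the ODBC drivers on this machine
        if has_access_driver(installed):
            print("Access ODBC driver found.")
        else:
            print("Access ODBC driver not found. Installed drivers:")
            for name in installed:
                print(" -", name)
```

If the driver is missing, the connection in the next step will fail no matter how the path is set.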

Fetching tasks from the database

We need the load_tasks function to fetch the information from the table we created earlier. You will need to replace Your_Table_Name with the name you chose for your table.

def load_tasks():
    conn = pyodbc.connect(CONN_STR)
    cursor = conn.cursor()
    cursor.execute("SELECT id, script, interval, time, last_run FROM Your_Table_Name")
    tasks = cursor.fetchall()
    conn.close()
    return tasks

Update the last run time

The update_last_run function will update the table in your database to record the last time that specific script ran. You will need to update this part of the code with your table name.

def update_last_run(task_id):
    conn = pyodbc.connect(CONN_STR)
    cursor = conn.cursor()
    cursor.execute("UPDATE Your_Table_Name SET last_run = ? WHERE id = ?", (datetime.now(), task_id))
    conn.commit()
    conn.close()

Running a task

The run_task function will run a script and then call the update_last_run function. If there is a problem, it will print an error message.

def run_task(script, task_id):
    try:
        # Read the script file and execute its contents
        with open(script) as f:
            exec(f.read())
        update_last_run(task_id)
    except Exception as e:
        print(f"Error running script {script}: {e}")
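The exec call runs each script inside the scheduler's own process, so a misbehaving script shares its memory and variables. One alternative, offered as a sketch rather than the method this guide uses, is to launch each script as a separate Python process with the standard subprocess module. The function name run_script_in_subprocess and the one-hour default timeout are assumptions.

```python
import subprocess
import sys


def run_script_in_subprocess(script, timeout=3600):
    """Run a Python script in its own process; return True on success."""
    try:
        result = subprocess.run(
            [sys.executable, script],  # same interpreter as the scheduler
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        print(f"Script {script} timed out after {timeout} seconds")
        return False
    if result.returncode != 0:
        # A non-zero exit code means the script failed or could not be started
        print(f"Error running script {script}: {result.stderr.strip()}")
        return False
    return True
```

With this approach, run_task would call update_last_run only when the helper returns True. Because sys.executable points at the interpreter running the scheduler, the scripts use the same virtual environment.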

Setup schedule

The setup_schedule function is responsible for running the scripts at specific times, whether it be daily, every hour, or every so many hours.

def setup_schedule():
    tasks = load_tasks()
    for task in tasks:
        task_id, script, interval, time_str, last_run = task
        if interval == "daily":
            schedule.every().day.at(time_str).do(run_task, script, task_id)
        elif interval == "hourly":
            schedule.every().hour.do(run_task, script, task_id)
        elif interval.startswith("every_"):  # Handle custom intervals
            # Take the number after the underscore, e.g. "every_6" -> 6
            hours = int(interval.split("_")[1])
            schedule.every(hours).hours.do(run_task, script, task_id)
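Because the interval values are typed into the database by hand, a stray space or typo would either be skipped silently or crash the int conversion. As a sketch, the interval text could be validated in one small function first. The name parse_interval and the (unit, amount) return format are assumptions of mine, and the "every_N" format is inferred from the startswith check in the function above.

```python
def parse_interval(interval):
    """Translate an interval string into a (unit, amount) pair.

    "daily" gives ("days", 1), "hourly" gives ("hours", 1), and a value
    such as "every_6" gives ("hours", 6). Anything else raises ValueError.
    """
    text = (interval or "").strip().lower()
    if text == "daily":
        return ("days", 1)
    if text == "hourly":
        return ("hours", 1)
    if text.startswith("every_"):
        amount = text.split("_", 1)[1]  # the part after the first underscore
        if amount.isdigit() and int(amount) > 0:
            return ("hours", int(amount))
    raise ValueError(f"Unrecognised interval: {interval!r}")


if __name__ == "__main__":
    for value in ["daily", "hourly", "every_6"]:
        print(value, parse_interval(value))
```

The scheduler could then catch the ValueError for a bad row, print a warning, and carry on with the remaining tasks.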


Main loop

The main loop is responsible for making the entire script run. It calls the setup_schedule function once to load the tasks, then repeatedly checks for tasks that are due and runs them. The time.sleep function tells the script how long to wait between checks. In this case, it checks every second. Increasing the number makes the script easier on system resources, but tasks may start later than scheduled by up to that many seconds.

if __name__ == "__main__":
    setup_schedule()
    print("Scheduler is running...")
    while True:
        schedule.run_pending()
        time.sleep(1)

Complete script

import schedule
import time
from datetime import datetime
import pyodbc

# Database connection string
DB_PATH = r"C:\Path\To\TaskScheduler.accdb"  # Update with your file path
CONN_STR = f"DRIVER={{Microsoft Access Driver (*.mdb, *.accdb)}};DBQ={DB_PATH};"

# Function to get tasks from the database
def load_tasks():
    conn = pyodbc.connect(CONN_STR)
    cursor = conn.cursor()
    cursor.execute("SELECT id, script, interval, time, last_run FROM Your_Table_Name")
    tasks = cursor.fetchall()
    conn.close()
    return tasks

# Function to update the last run time
def update_last_run(task_id):
    conn = pyodbc.connect(CONN_STR)
    cursor = conn.cursor()
    cursor.execute("UPDATE Your_Table_Name SET last_run = ? WHERE id = ?", (datetime.now(), task_id))
    conn.commit()
    conn.close()

# Function to dynamically run scripts
def run_task(script, task_id):
    try:
        # Read the script file and execute its contents
        with open(script) as f:
            exec(f.read())
        update_last_run(task_id)
    except Exception as e:
        print(f"Error running script {script}: {e}")

# Schedule tasks dynamically
def setup_schedule():
    tasks = load_tasks()
    for task in tasks:
        task_id, script, interval, time_str, last_run = task
        if interval == "daily":
            schedule.every().day.at(time_str).do(run_task, script, task_id)
        elif interval == "hourly":
            schedule.every().hour.do(run_task, script, task_id)
        elif interval.startswith("every_"):  # Handle custom intervals
            # Take the number after the underscore, e.g. "every_6" -> 6
            hours = int(interval.split("_")[1])
            schedule.every(hours).hours.do(run_task, script, task_id)

# Main scheduler loop
if __name__ == "__main__":
    setup_schedule()
    print("Scheduler is running...")
    while True:
        schedule.run_pending()
        time.sleep(1)

Summary

Paste this script into a new file in the src folder we created in the last guide, and replace the database path and Your_Table_Name with your own. Add the scripts you want to run to your database table along with their schedules, and then start the script to see it in action.

This script will be vital for keeping everything running on time with our AI project, but you can start using it to do things automatically right now. For instance, you can use it to make posts on Facebook and X or check your internet speed regularly. We already have these scripts here at GeekSided, and you can plug them right into your database to start running automatically.

Next time, we’ll start creating a script to track our files, which we will use with our AI!
