Your task in this assignment is to create a simple search program that implements BWT backward search, which can efficiently search a BWT encoded file. The program also has the capability to decode the BWT encoded file back to its original file in a lossless manner. The original file (before BWT) contains text in multiple lines and may include characters with ASCII values 10 and 32 to 126 (inclusively). There will be at least one newline character just before the end of file (to make sure that every line of text is ended with a newline character). Each line begins with a line number. All line numbers of a file have the same number of digits (packed with leading zeros as appropriate). Sample input files (BWT encoded) have been provided in the folder ~cs9319/a2. To help in testing your program, the corresponding original files (after reverse BWT) are also included in that folder. These original files will not be available when we test your submission. Although you do not need to submit a BWT encoder, it is recommended that you will implement a simple BWT encoding program (this will help you in understanding the lecture materials and assist in testing your assignment). Your C/C++ program, called bwtsearch, accepts an optional flag; the path to a BWT encoded file; the path to an index file; and one quoted, non-empty query string as commandline input arguments. Using the given query string, it will perform a backward search on the given BWT encoded file and output all the lines that contain the input query string to the standard output. The search results are sorted according to their line numbers in ascending order. You may assume that the query string does not contain any newline characters. Since each line is ended with a newline character, your output will naturally be displayed as one line (ending with a '\n') for each match. No line will be output more than once, i.e., if there are multiple matches in one line, that line will only be output once. If an optional flag -m is specified, given a query string, bwtsearch will output the total number of matching substrings (count duplicates) to the standard output. The output is the total number, with an ending newline character. Similarly, bwtsearch will output the total number of unique matching records (do not count duplicates) if -n is specified. If -o is specified, no query string is required in the commandline. Instead, a output filename needs to be specified, a reverse BWT is performed and the original file (before BWT) is output (created, or overwritten if already exists) to the specified output file. Your solution can write out one external index file at any time that is no larger than half of the size of the given input BWT file, plus an extra 2048 bytes. If your index file is larger than half of the size of the input BWT file plus 2048 bytes, you will receive zero points for the tests that use that file. You may assume that the index file will not be deleted during all the tests for a given BWT file, and all the test BWT files are uniquely named. Therefore, to save time, you only need to generate the index file when it does not exist yet. 7/27/2021 COMP9319 2021T2 Assignment 2 http://www.cse.unsw.edu.au/~wong/cs9319-21a2.html 2/6 Example Consider one of the given sample files tiny.bwt. You can find the file by logging into CSE machines and going to folder ~cs9319/a2. The original file (tiny.txt) before BWT is also provided in that folder. The following are some examples using tiny.bwt. %wagner> bwtsearch -m ~cs9319/a2/tiny.bwt ./tiny.idx "ana" 2 %wagner> bwtsearch -n ~cs9319/a2/tiny.bwt ./tiny.idx "ana" 1 %wagner> bwtsearch ~cs9319/a2/tiny.bwt ./tiny.idx "ana" 01 banana %wagner> bwtsearch -m ~cs9319/a2/tiny.bwt ./tiny.idx "erry" 3 %wagner> bwtsearch -n ~cs9319/a2/tiny.bwt ./tiny.idx "erry" 3 %wagner> bwtsearch ~cs9319/a2/tiny.bwt ./tiny.idx "erry" 10 raspberry 11 cherry 13 blackberry %wagner> bwtsearch -o ~cs9319/a2/tiny.bwt ./tiny.idx tiny.decoded.txt %wagner> diff ~cs9319/a2/tiny.txt tiny.decoded.txt %wagner> More examples: %wagner> bwtsearch -n ~cs9319/a2/6MB.bwt ./6MB.idx "compilers" 1 %wagner> bwtsearch -m ~cs9319/a2/6MB.bwt ./6MB.idx "compilers" 1 %wagner> bwtsearch -m ~cs9319/a2/6MB.bwt ./6MB.idx "compiler" 8 %wagner> bwtsearch -n ~cs9319/a2/6MB.bwt ./6MB.idx "compiler" 8 %wagner> bwtsearch ~cs9319/a2/6MB.bwt ./6MB.idx "compiler" 020266 Verification of the C0 compiler implementation on the source code level. 021834 A framework for intelligent speculative compiler optimizations and its application to memory accesses. 035239 A compiler backend for generic programming with arrays. 044877 C compiler aided design of application specific instruction set processors using the machine description language LISA. 052175 Semantics-directed generation of compilers and abstract machines 123034 Evaluating compiler technology for control-flow optimizations for multimedia extension architectures. 123777 Design and implementation of a queue compiler. 125902 Hardware-compiler co-design for adjustable data power savings. %wagner> %wagner> bwtsearch ~cs9319/a2/sherlock.bwt ./SL.index ",000 pounds" 002001 the stake will be some 30,000 pounds; and for you, Jones, it will 004503 of some 14,000 pounds, which lay to his credit at the bank." 7/27/2021 COMP9319 2021T2 Assignment 2 http://www.cse.unsw.edu.au/~wong/cs9319-21a2.html 3/6 010537 50,000 pounds at once. I could, of course, borrow so trifling a %wagner> bwtsearch -m ~cs9319/a2/sherlock.bwt ./SL.index ",000 pounds" 3 %wagner> bwtsearch -n ~cs9319/a2/sherlock.bwt ./SL.index ",000 pounds" 3 %wagner> We will use the make command below to compile your solution. Please provide a makefile
北美代写,Homework代写,Essay代寫-准时✔️高质✔最【靠谱】Note: Be sure to adhere to the University’s Policy on Academic Integrity as discussed in class. Programming assignments are to be written individually and submitted programs must be the result of your own efforts. Any suspicion of academic integrity violation will be dealt with accordingly. Objective: Expressing the problem as a search problem and identifying proper solving methods. Specifying, designing and implementing uninformed & informed search methods. Assignment Details: In this programming assignment you will implement a set of search algorithms (BFS, DFS, Greedy, A*) to find solution to the sliding puzzle problem: Theoretically solving the problems as a search process includes progressive construction of a solution: ? Establish the problem's components o Initial state o Final state o Operators (successor functions) o Solution ? Defining the search space ? Establish the strategy for search a solution into the search space. Let’s start by defining the sliding puzzle problem: For a given puzzle of n x n squares with numbers from 1 to (n x n-1) (one square is empty) in an initial configuration, find a sequence of movements for the numbers in order to reach a final given configuration, knowing that a number can move (horizontally or vertically) on an adjacent empty square. You will solve the puzzle for size n = 2 (2 x 2 squares), 3 (3 x 3 squares) and 4 (4 x 4 squares). The final configuration/goal state for each puzzle of size n is as follows: Important: Solvability of the sliding puzzle problem Not all initial boards can lead to the goal board by a sequence of moves, including these two: Remarkably, we can determine whether a board is solvable without solving it! To do so, we count inversions, as described next. ? Inversions: Given a board, an inversion is any pair of tiles i and j where i < j but i appears after j when considering the board in row-major order (row 0, followed by row 1, and so forth). ? Odd-sized boards: If a board has an odd number of inversions, it is unsolvable because the goal board has an even number (zero) of inversions. It turns out that the converse is also true: if a board has an even number of inversions, then it is solvable. In summary, when n is odd, an n-by-n board is solvable if and only if its number of inversions is even. ? Even-sized boards: Now, we’ll consider the case when the board size n is an even integer. In this case, the parity of the number of inversions is not invariant. However, the parity of the number of inversions plus the row of the blank square (indexed starting at 0) is invariant: each move changes this sum by an even number. That is, when n is even, an n-by-n board is solvable if and only if the number of inversions plus the row of the blank square is odd. Assignment Summary: The search algorithms you are expected to implement are: ? Breadth-first search (BFS) ? Depth-first search (DFS) – depth-first search needs to check for cycles, or a solution will most likely never be found. ? Greedy best-first search (GBFS), using Manhattan Distance as the heuristic. ? A* (AStar) using Manhattan Distance (discussed during lecture session) as the heuristic. Input: Your program will accept instructions from the command line. The program should accept the following inputs in the following format: ? Sliding puzzle problem: [size] “[initialstate]” [searchmethod] o [size] of sliding puzzle problem (2,3 or 4) o [initialstate] must characters, namely the digits 1-9, letters A-F (only for size 4) and a space, in any order. Make sure the initial state is solvable from the given goal state. o [searchmethod] can be: BFS, DFS, GBFS, AStar. o Examples: § 2 “ 321” DFS § 3 “47315862 ” GBFS § 4 “123456789AB DEFC” BFS Output: Your program will generate 2 different outputs – one to the console & and another to a readme file. ? Console Output: Show solution path to the sliding puzzle. Remember planning is part of the simulation process and on console (to user) we simply want to show the final solution path from initial state to the goal state for each search method. ? Readme.txt: The Readme should include – o You need to include size, initial and goal state of the problem. o And for each state and size of the problem, include searchmethod and report a comma-separated list of integers listed in the following format. This format should represent the state of your search tree at the moment the solution was discovered: [depth], [numCreated], [numExpanded], [maxFringe] o [depth] represents the depth in the search tree where the solution is found. The integer will be zero if the solution is at the root and it will be “-1” if a solution was not found. o [numCreated] is the counter that is incremented every time a node of the search tree is created (output 0 if depth == -1). o [numExpanded] is the counter that will be incremented every time the search algorithm acquires the successor states to the current state, i.e., every time a node is pulled off the fringe and found not to be the solution (output 0 if depth == -1). o [maxFringe] is the maximum size of the fringe at any point during the search (output 0 if depth == -1). Hints: ? The successors of a state in this problem depend greatly on the position of the blank spot. Rather than think about which tiles can move into the blank spot, try considering where the blank spot can move. Certain numerical qualities about this position will determine whether or not the blank can move left, right, up, or down. If a blank spot moves up, then its location is being swapped with the location above it. ? The heuristics can be generated by comparing the state being evaluated to the goal state. The number of misplaced tiles is easily calculated in time linear to the number of tiles in the puzzle, but the simple solution to the Manhattan distance requires time quadratic to the number of tiles in the puzzle. ? If you plan to start from scratch, spend a lot of time thinking about what data structures you will use for your fringe. Much of the variation between the algorithms comes in how to determine which elements to pull off the fringe next. ? Many of these algorithms are more similar than different. Think about how you can use polymorphism & fringe (one queue/list for all) to make your life easier. Submission Guidelines: Zip and upload the following files on Canvas using the Programming Assignment 1 submission link: ? Board.java/Board.py: Source code that models an n-by-n board with sliding tiles. Your heuristic function resides in this file. ? Solver.java/Solver.py: Source code that implements BFS, DFS, GBFS, AStar to solve n-by-n sliding puzzle. ? Tester.java/Tester.py: A tester file that connects your Board and Solver code. ? Readme.txt: Output for each of the algorithm and problem size in the format explained in the output section. Please refer to the grading rubric for breakdown of the points.